The eESS infrastructure for enterprise search and information logistics is the ideal solution for getting quick, meaningful and personalized search results in your company.
The eccenca Enterprise Search Suite (eESS) is based on the Eclipse Foundation project SMILA, which we initiated and played a decisive part in implementing. eESS therefore offers a sustainable, globally recognized architecture backed by the Eclipse Foundation.
Key guiding principles in the development of the eccenca Enterprise Search Suite were:
- To ensure sustainability of client investment in a data access architecture
- Flexibility through (parallel) use of the architecture with any open-source and/or closed-source algorithms, e.g. from the fields of linguistics, semantics and search
- To meet the highest requirements for access control
- Flexible scalability
- Maximum efficiency
It has been demonstrated that each of these objectives could be achieved.
Sustainability, openness, and flexibility
Clients such as Volkswagen AG operate eESS as the standard architecture with various search technologies such as Solr or Google Search Appliance (GSA).
Clients with many fine-grained user rights across different rights dimensions opt for eESS as a platform when they want to operate search technologies such as Solr or GSA efficiently and in accordance with corporate security policies.
With so-called ‘search-driven applications’, query volume can quickly rise to more than one million searches per hour, alongside large volumes of documents. With its different scaling options, eESS is designed specifically for such scenarios.
The cost advantages of open source are obvious: there is no large initial investment in licenses. Instead, a usage-based contract covers maintenance, support, warranty and indemnification. We provide these services centrally and cost-effectively, and they are essential for operating software in a professional environment.
With an annual integrated usage and maintenance fee per core, eESS offers value-based pricing tailored to the client. The fee covers not only the complete eccenca Enterprise Search Suite, including some professional enhancements of our own development, but also the market-leading document conversion software OutsideIn by Oracle.
1. Back-end Connectivity
- Agent and Connector framework for developing your own interfaces to your data sources. The connectors supplied with eESS include crawlers for file systems (generic), NTFS (with rights), Web, JDBC, CSV and Confluence.
- A variety of interfaces for widely used data sources such as SharePoint, Lotus Notes, Exchange, SAP, Documentum, EMC², Liferay, etc. are available as commercial components with maintenance commitments and enterprise support.
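To illustrate what a custom connector does, here is a minimal sketch of a CSV crawler that turns rows into records ready for further processing. It is not the SMILA or eESS connector API: the function name `crawl_csv`, the `_source` and `_id` fields, and the sample data are all assumptions made for this example.

```python
import csv
import io
from typing import Dict, Iterator


def crawl_csv(source_id: str, text: str, id_column: str) -> Iterator[Dict[str, str]]:
    """Yield one record per CSV row, tagged with its source and a stable ID.

    Hypothetical interface for illustration only, not the eESS connector API.
    """
    reader = csv.DictReader(io.StringIO(text))
    for row in reader:
        record = dict(row)
        record["_source"] = source_id                     # where the record came from
        record["_id"] = f"{source_id}:{row[id_column]}"   # stable ID for later updates
        yield record


# Hypothetical sample data standing in for a real CSV file.
sample = "sku,title\n100,Brake pad\n200,Oil filter\n"
records = list(crawl_csv("parts-catalog", sample, id_column="sku"))
for record in records:
    print(record["_id"], record["title"])
```

A stable record ID matters because later stages, such as updating or deleting indexed documents, need to recognize the same source item again.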
2. Data Conversion and Processing
- The standard distribution includes Oracle OutsideIn and supports more than 300 document formats, including MS Word, MS Excel, MS PowerPoint, OpenOffice, PDF, RTF and HTML.
- On request, we can include partner components for the Speech2Text conversion of video and audio files.
3. Linguistic and semantic pre-processing
- Thanks to the OSGi component model, any procedure for enriching documents or extracting data from them can be plugged in very easily for document analysis.
- When the extended variant of eESS with linguistic components is selected, an integrated version of the Rosette® components ‘Base Linguistics’ and ‘Entity Extractor’ developed by Basis Technology is included ‘out of the box’.
- More tools can be integrated with eESS on request! We have a wide-ranging network of highly specialized partners who focus on extracting knowledge from unstructured data. The spectrum ranges from ‘sentiment analysis’ to the extraction of specific chemical formulas from texts and images.
- The connection of triplestores and integration with the Linked Data Web is another building block of eESS and is currently under development. One sample use case is the automatic training of entity extraction with data from a Linked Data graph.
4. Indexing
- Pre-processing is generally followed by indexing of the data by the search technology in use. Solr and the Google Search Appliance have already been successfully integrated into eESS.
- Other search technologies can also be integrated on request.
5. Queries and searches
- Thanks to pipelining in the search, queries can be freely modified or extended (e.g. by a thesaurus). The same applies to the results before hits are returned (e.g. via document clustering).
- The optional User Rights Management extends search queries with rights-specific criteria, so that the results already contain only what the user is allowed to see and do not have to be filtered inefficiently afterwards.
- Query parsing and disambiguation using graphs are currently under development.
- Also under development is the connection to the eccenca Knowledge Base through SPARQL queries on triplestores.
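The query pipelining and rights handling described in this section can be sketched as follows. This is a minimal illustration under stated assumptions, not the eESS implementation: the `Query` class, the step functions, the group-based permission model and the toy in-memory `search` function are all hypothetical. It shows a query passing through a thesaurus expansion step and a rights step before it reaches the search engine, so that unauthorized documents never appear in the result list.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set


@dataclass
class Query:
    terms: List[str]
    allowed_groups: Set[str] = field(default_factory=set)


Step = Callable[[Query], Query]


def expand_with_thesaurus(thesaurus: Dict[str, List[str]]) -> Step:
    """Build a step that appends known synonyms to the query terms."""
    def step(query: Query) -> Query:
        expanded = list(query.terms)
        for term in query.terms:
            for synonym in thesaurus.get(term, []):
                if synonym not in expanded:
                    expanded.append(synonym)
        return Query(expanded, set(query.allowed_groups))
    return step


def restrict_to_groups(user_groups: Set[str]) -> Step:
    """Build a step that attaches the user's groups as a filter criterion."""
    def step(query: Query) -> Query:
        return Query(list(query.terms), set(user_groups))
    return step


def run_pipeline(query: Query, steps: List[Step]) -> Query:
    """Apply each step in order; every step returns a new query."""
    for step in steps:
        query = step(query)
    return query


def search(index: List[dict], query: Query) -> List[str]:
    """Toy engine: match any term, but only within permitted documents."""
    hits = []
    for doc in index:
        permitted = bool(doc["groups"] & query.allowed_groups)
        matches = any(term in doc["text"].lower() for term in query.terms)
        if permitted and matches:
            hits.append(doc["id"])
    return hits


# Hypothetical sample index and user.
index = [
    {"id": "doc1", "text": "Automobile maintenance manual", "groups": {"engineering"}},
    {"id": "doc2", "text": "Car pricing strategy", "groups": {"board"}},
    {"id": "doc3", "text": "Canteen menu", "groups": {"engineering", "board"}},
]
steps = [
    expand_with_thesaurus({"car": ["automobile"]}),
    restrict_to_groups({"engineering"}),
]
final_query = run_pipeline(Query(["car"]), steps)
hits = search(index, final_query)
print(final_query.terms, hits)
```

Because the rights criterion is part of the query itself, the engine never returns `doc2` to this user, and no second filtering pass over the hit list is needed.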
- Integration of data sources via crawlers or use of REST interfaces
- Pipeline concept for data preparation, extraction and enrichment before indexing and after searching
- Delta check for efficient updating of data
- Scaling through asynchronous workflows
- Document conversion for popular file formats (extraction of text and meta information from binary formats)
- Linguistic components for lemmatization and compound-word handling
- Extraction of predefined and custom entities
- RDF export of data such as entities
- Mapping of complex authorization concepts
- LDAP connection for use of existing permissions e.g. for file shares
- Synonym thesaurus for automatic expansion of search queries
- Use of various search technologies such as Solr or GSA
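The delta check listed above can be pictured with a short sketch. It assumes a content-hash comparison between crawls, which is a common way to implement such a check and not necessarily how eESS does it; the function names and the sample documents are hypothetical.

```python
import hashlib
from typing import Dict, List, Tuple


def fingerprint(content: str) -> str:
    """Return a SHA-256 hex digest of the document content."""
    return hashlib.sha256(content.encode("utf-8")).hexdigest()


def delta_check(
    known: Dict[str, str], crawled: Dict[str, str]
) -> Tuple[List[str], List[str], List[str], Dict[str, str]]:
    """Compare a crawl against stored fingerprints.

    Returns the IDs of new, changed and deleted documents, plus the
    fingerprint state to store for the next crawl.
    """
    current = {doc_id: fingerprint(text) for doc_id, text in crawled.items()}
    new = sorted(d for d in current if d not in known)
    changed = sorted(d for d in current if d in known and known[d] != current[d])
    deleted = sorted(d for d in known if d not in current)
    return new, changed, deleted, current


# First crawl: everything is new.
first = {"a.txt": "alpha", "b.txt": "beta"}
_, _, _, state = delta_check({}, first)

# Second crawl: one document changed, one was added, one is untouched.
second = {"a.txt": "alpha", "b.txt": "beta v2", "c.txt": "gamma"}
new, changed, deleted, state = delta_check(state, second)
print(new, changed, deleted)
```

Only the new and changed documents need to go through conversion, pre-processing and indexing again, which is where the efficiency gain comes from.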
System requirements
- Windows Server 2008, Windows Server 2012, Windows 7 or Windows 8; various Linux distributions (64-bit)
- Java Runtime version 1.7
- 4 GB or more RAM
- 2 GB or more free disk space