The real SPONGE
Sponge helps you transform a limitless amount of unstructured content into valuable information that can be used anywhere. Whether the source is a web site, blog, forum, social network or document, Sponge is a full-featured, flexible and extensible crawler that runs on any platform and helps you crawl whatever information you want, the way you want it.
Using the Sponge
Sponge helps you do three things: crawl data from web sites, from social networks and from documents on file systems. All three crawlers share a common platform that lets you manage data, crawling jobs and security in a unified way.
Web sites are crawled with configurable HTTP spiders. We provide a simple web user interface where users can load any web page and define how to extract the data they are interested in.
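The extraction rules defined in the web UI can be pictured as a mapping from field names to element selectors. The sketch below is purely illustrative (the rule format and field names are invented, not Sponge's actual configuration), using only Python's standard HTML parser:

```python
from html.parser import HTMLParser

# Hypothetical rule format: field name -> (HTML tag, CSS class) picked
# by the user when highlighting an area of the page in the web UI.
RULES = {
    "title": ("h1", "product-title"),
    "price": ("span", "price"),
}

class RuleExtractor(HTMLParser):
    """Collects the text content of elements matching the configured rules."""
    def __init__(self, rules):
        super().__init__()
        self.rules = rules
        self.active_field = None
        self.result = {}

    def handle_starttag(self, tag, attrs):
        classes = dict(attrs).get("class", "").split()
        for field, (want_tag, want_class) in self.rules.items():
            if tag == want_tag and want_class in classes:
                self.active_field = field

    def handle_data(self, data):
        if self.active_field:
            self.result[self.active_field] = data.strip()
            self.active_field = None

page = '<h1 class="product-title">Widget</h1><span class="price">9.99</span>'
extractor = RuleExtractor(RULES)
extractor.feed(page)
# extractor.result now holds {"title": "Widget", "price": "9.99"}
```

A real spider would apply the same rule set to every page it fetches, turning each match into an attribute of an extracted object.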
Social network data is extracted using the specialized APIs made available by each vendor. We currently support Facebook, Twitter, Google+ and YouTube, and we are working to integrate more platforms.
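Whatever the vendor, the result of an API call is typically a JSON payload that gets normalized into objects and relations. The payload shape below is made up for illustration and does not match any vendor's real schema:

```python
import json

# Illustrative only: this payload mimics the general shape of a social
# API response; real vendor schemas (Facebook, Twitter, ...) differ.
payload = json.dumps({
    "user": {"id": "42", "name": "alice"},
    "posts": [
        {"id": "p1", "text": "hello world"},
        {"id": "p2", "text": "crawling is fun"},
    ],
})

def to_graph(raw):
    """Normalize a raw API response into objects and relations."""
    data = json.loads(raw)
    objects = [{"type": "user", **data["user"]}]
    relations = []
    for post in data["posts"]:
        objects.append({"type": "post", **post})
        # A "posted" relation linking the user to each of their posts.
        relations.append(("posted", data["user"]["id"], post["id"]))
    return objects, relations

objects, relations = to_graph(payload)
```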
bigConnect can ingest and process any kind of information. Whether it is stored in databases, office documents, text files, XML files, HTML files, images, audio files, video files or data streams, or comes from Facebook, Twitter, YouTube, Google+ or web sites, our content extraction engines and crawlers will automatically extract the relevant information as objects, relations and attributes.
Using a unified management console, Sponge lets users run, schedule and monitor crawling jobs, configure crawling parameters and track execution progress.
Structured file ingestion
Just drag an XLS or CSV file onto the workspace and bigConnect will automatically parse the content and let you configure the mappings in our mapping editor.
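Conceptually, such a mapping renames CSV columns to entity attributes. A minimal sketch, with a made-up mapping rather than the mapping editor's actual format:

```python
import csv
import io

# Hypothetical mapping, as a user might configure it in the mapping
# editor: CSV column header -> entity attribute name.
MAPPING = {"Full Name": "name", "E-mail": "email"}

def ingest(csv_text, mapping):
    """Parse CSV rows and rename columns according to the mapping."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        {attr: row[col] for col, attr in mapping.items()}
        for row in reader
    ]

rows = ingest("Full Name,E-mail\nAda Lovelace,ada@example.com\n", MAPPING)
# rows == [{"name": "Ada Lovelace", "email": "ada@example.com"}]
```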
The content of a workspace, where collected data lives, is completely secure and can be shared with others under different access rights. For example, you might want some users to see the job status but restrict access to the actual data that was extracted.
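The job-status-but-not-data scenario boils down to per-user rights on a workspace. A toy sketch (the right names and rights model here are invented for illustration):

```python
# Hypothetical per-user rights on a workspace: "job_status" lets a user
# see crawl progress, "data" additionally exposes the extracted content.
RIGHTS = {
    "alice": {"job_status", "data"},   # full access
    "bob": {"job_status"},             # status only, no data
}

def can_view(user, resource):
    """True if the user was granted the given right on this workspace."""
    return resource in RIGHTS.get(user, set())

# bob can follow the job but cannot open the extracted data.
```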
Once the information is collected and processed, it needs to reach the location and format of your choice. A pluggable data output mechanism will send data to bigConnect or to other stores such as SQL databases, Elasticsearch or Solr.
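A pluggable output mechanism usually means every destination implements the same small interface. The sketch below is an assumption about the design, not Sponge's actual plugin API; an in-memory sink stands in for SQL, Elasticsearch or Solr:

```python
# Hypothetical output-plugin interface: every destination implements
# write(); new stores are added by writing a new sink class.
class OutputSink:
    def write(self, record):
        raise NotImplementedError

class MemorySink(OutputSink):
    """Stand-in for a real store (SQL, Elasticsearch, Solr, ...)."""
    def __init__(self):
        self.records = []

    def write(self, record):
        self.records.append(record)

def export(records, sinks):
    """Fan each extracted record out to every configured sink."""
    for record in records:
        for sink in sinks:
            sink.write(record)

sink = MemorySink()
export([{"id": 1}, {"id": 2}], [sink])
# sink.records == [{"id": 1}, {"id": 2}]
```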
Running crawlers report their status to the management console for easy monitoring of crawling jobs. Users can review job execution status, crawling errors or intermediate data that was collected.
Crawler-extracted data can be tagged with additional information, translated, split, merged, trimmed, filtered and more. We have a lot of predefined taggers and transformers, and users can even run scripts to tag and transform content according to their requirements.
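Taggers and transformers naturally compose into a pipeline: each step takes a record and returns it modified. A minimal sketch, assuming a simple function-per-step design (the step names and the trivial language tagger are illustrative):

```python
# Each pipeline step is a plain function from record to record.
def trim(record):
    """Transformer: strip surrounding whitespace from the text."""
    record["text"] = record["text"].strip()
    return record

def tag_language(record):
    """Tagger: attach a language tag. A toy stand-in -- real taggers
    would actually detect the language of the text."""
    record["tags"] = record.get("tags", []) + ["en"]
    return record

def run_pipeline(record, steps):
    """Apply each configured step to the record, in order."""
    for step in steps:
        record = step(record)
    return record

out = run_pipeline({"text": "  hello  "}, [trim, tag_language])
# out == {"text": "hello", "tags": ["en"]}
```

User-supplied scripts would slot into the same chain as just another step function.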
Any website is a valuable source of information. Sponge lets you easily decode any raw website by highlighting certain areas of the site and mapping them to the structure you want.
With a user-friendly web interface, data mapping can be done in a couple of clicks. Sponge identifies the unique, relevant HTML landmarks using smart algorithms.
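One simple form of such a landmark is the tag path leading to the highlighted element, which lets the crawler locate the same area again on a later visit. A rough sketch of the idea (not Sponge's actual algorithm), using the standard-library HTML parser:

```python
from html.parser import HTMLParser

# Illustrative landmark: the chain of ancestor tags down to the element
# the user highlighted, identified here by a CSS class.
class PathFinder(HTMLParser):
    def __init__(self, target_class):
        super().__init__()
        self.target_class = target_class
        self.stack = []
        self.path = None

    def handle_starttag(self, tag, attrs):
        self.stack.append(tag)
        if dict(attrs).get("class") == self.target_class:
            self.path = " > ".join(self.stack)

    def handle_endtag(self, tag):
        if self.stack and self.stack[-1] == tag:
            self.stack.pop()

finder = PathFinder("headline")
finder.feed('<html><body><div><h1 class="headline">News</h1></div></body></html>')
# finder.path == "html > body > div > h1"
```

Production systems combine such paths with ids, classes and sibling positions to keep the landmark stable when the page layout shifts.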
Social networks hold a lot of information that can be used for many purposes: social analysis, customer voice, identifying influencers or spotting hot topics.
A lot of valuable data still resides in local content silos: local disks, external drives, network shares, archives or even HDFS. Sponge includes a crawler that can effectively collect, parse, manipulate and store this information.
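The core of such a file-system crawler is a recursive walk that collects metadata for files matching the configured filters. A minimal sketch (the extension filter and the returned fields are assumptions, and a real crawler would also hand each file to a content parser):

```python
import os
import tempfile

def crawl(root, extensions):
    """Walk a directory tree and collect path and size for files whose
    extension is in the configured set."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if os.path.splitext(name)[1] in extensions:
                full = os.path.join(dirpath, name)
                found.append({"path": full, "size": os.path.getsize(full)})
    return found

# Demo against a throwaway directory.
root = tempfile.mkdtemp()
with open(os.path.join(root, "report.txt"), "w") as f:
    f.write("hello")
with open(os.path.join(root, "image.bin"), "w") as f:
    f.write("xx")

docs = crawl(root, {".txt"})
# docs contains only report.txt, with its size in bytes
```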
The Sponge Users
National Security
Sponge is used to collect and correlate information from websites and social networks in order to identify potential threats to national security.
Anti Fraud and Law enforcement
Sponge is used to ingest information from news sites and social profiles to discover relationships between people, actions, facts and locations, and to interpret them in order to prevent fraud and corruption.
Banking & Telecom
Sponge can extract information from social networks, forums and blogs to identify the customer voice and the general sentiment regarding customer services. It can also act as an anti-churn and prevention tool for strengthening the client installed base, as well as a market-feedback receptor.
Social Network Analysis
Sponge extracts complete information from social networks and is a key data provider for complex social network analysis.
Extracting information from relevant web sites, forums, blogs and other public information sources, and correlating it with social network data, enables organizations to create complete, 360-degree people profiles.