Document Extraction To Determine The Qualitative Effect Of Important Events
An estimated 1 million new news pieces arrive at Dow Jones every day, adding to a premium news archive that spans more than 30 years. The company sought to use its new cloud-based content processing and storage infrastructure, called Dow Jones DNA, to enable scalable, flexible access to its premium news library of 1.3 billion documents, which is among the largest in the world.
Objectives
The client's intention was to build a cloud-based process that runs over large volumes of historical and streaming business news documents and surfaces hidden insights by transforming text into named entities (e.g. people, locations, money, and events) and the relationships among them. The aim was to demonstrate the depth and breadth of the DNA dataset. The client found that these articles could serve as data points to inform changing industry demand in areas such as portfolio management, sales, business development, risk target identification, and deal opportunity aggregation.
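To make the idea of turning text into named entities and relationships concrete, here is a minimal sketch in Python. It is not the client's pipeline: the gazetteer, the sample article, and the function names are all hypothetical, and a production system would likely use a trained entity-recognition model rather than a lookup list. Relationships here are simply entities that appear in the same sentence.

```python
import re
from itertools import combinations

# Hypothetical gazetteer: a real pipeline would use a trained NER model instead.
GAZETTEER = {
    "PERSON": ["Jane Doe"],
    "LOCATION": ["Houston", "Texas"],
    "EVENT": ["hurricane"],
}
# Matches amounts such as "$2.5 billion" or "$1,200".
MONEY_PATTERN = re.compile(r"\$\d[\d,]*(?:\.\d+)?(?:\s(?:million|billion))?")


def extract_entities(sentence):
    """Return a list of (entity_type, surface_text) pairs found in one sentence."""
    found = []
    for entity_type, names in GAZETTEER.items():
        for name in names:
            if re.search(r"\b" + re.escape(name) + r"\b", sentence, re.IGNORECASE):
                found.append((entity_type, name))
    for match in MONEY_PATTERN.finditer(sentence):
        found.append(("MONEY", match.group()))
    return found


def extract_relations(text):
    """Link every pair of entities that co-occur in the same sentence."""
    relations = []
    for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
        entities = extract_entities(sentence)
        for left, right in combinations(entities, 2):
            relations.append((left, "CO_OCCURS_WITH", right))
    return relations


# Made-up sample text, used only for illustration.
article = (
    "A hurricane hit Houston on Friday. "
    "Jane Doe estimated losses of $2.5 billion in Texas."
)
for relation in extract_relations(article):
    print(relation)
```

Sentence-level co-occurrence is a deliberately crude relation signal; it is used here only because it needs no model and shows the shape of the output (typed entities joined by labelled edges).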
Solutions
To process the client's terabyte-scale, unstructured data corpus, Jaiinfoway built a Knowledge Graph architecture that helps data scientists and developers study network effects and the economic ramifications of rare global events, such as a major natural disaster. Customers can also surface other significant events, undiscovered connections, or hidden opportunities that might affect their organisation. For text mining, machine learning, data integration, and enterprise advanced analytics, the tool uses the Google Cloud Platform, the Dow Jones DNA (Data, News & Analytics) service, TensorFlow, and a graph database platform.
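As a rough illustration of how a knowledge graph can expose network effects, the sketch below stores entities as nodes and finds everything within a given number of hops of an event. It is an in-memory stand-in, not the graph database platform used in the project; the class, the relation labels, and the company names are all invented for the example.

```python
from collections import defaultdict, deque


class KnowledgeGraph:
    """Undirected graph of entities; each edge carries a relation label."""

    def __init__(self):
        self.edges = defaultdict(dict)

    def add_relation(self, source, relation, target):
        # Store the edge in both directions so traversal ignores direction.
        self.edges[source][target] = relation
        self.edges[target][source] = relation

    def within_hops(self, start, max_hops):
        """Breadth-first search: map each reachable entity to its hop distance."""
        distances = {start: 0}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            if distances[node] == max_hops:
                continue  # do not expand beyond the hop limit
            for neighbour in self.edges[node]:
                if neighbour not in distances:
                    distances[neighbour] = distances[node] + 1
                    queue.append(neighbour)
        del distances[start]
        return distances


# Hypothetical entities and relations, for illustration only.
graph = KnowledgeGraph()
graph.add_relation("Hurricane", "AFFECTS", "Houston")
graph.add_relation("Houston", "HOSTS", "Refinery Co")
graph.add_relation("Refinery Co", "SUPPLIES", "Airline Inc")
graph.add_relation("Airline Inc", "COMPETES_WITH", "Rail Corp")

print(graph.within_hops("Hurricane", 2))
```

Limiting the search by hop count is one simple way to ask which organisations sit close enough to an event to feel its effects; a real deployment would more likely express the same question in the graph database's own query language.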
- Inconsistent file storage and difficulty maintaining document versions
- A poor-quality database for filtering information
- Storage was at a premium, so old documents needed to be archived
- Information was needed quickly; staff could not waste time searching for the right files and poring over paper forms
Benefits
By working with Jaiinfoway, a software development company, the client gained the following benefits:
- With version control, they can now easily review previous versions of any document.
- The file and folder description feature lets them describe a folder's contents accurately without making filenames unnecessarily long.
- They can create folders that behave like database tables using indexes, so all their databases are now in one location, under control, and behave the same way.
- Features such as integrated scanning, a PDF printer, and PDF assembly help them convert documents to PDF and then annotate, stamp, or highlight important text.
- The folder template feature helps them organize documents so that searches return results quickly.
The Impact
The performance and outcome of the project were both very well received by the client.
The issue was resolved by identifying diseases and issues early on in order to reduce the possibility that individuals would require emergency care.
Industry: Healthcare
Technology Leveraged
- Google Cloud Platform
- Google’s BigQuery
- Python
- SQL
- JavaScript
- SaaS