Government contract bidding documents often run to thousands of pages of heterogeneous and poorly structured information. Teams of specialists from multiple corporate departments must read through and understand these huge files before they can determine whether the company can submit a bid and fulfil the resulting contract. And the deadlines are usually tight. Misunderstandings can lead a company to miss a business opportunity or, worse, commit to a project it cannot complete.
A leading engineering and energy corporation turned to CEA-List, which has advanced natural language processing and requirements engineering tools and methods. The goal was to identify wording indicative of contractual obligations buried in the huge volume of information in government contract bidding files, and to direct the attention of the appropriate department (civil engineering, general engineering, project management, legal, quality, etc.) to these critical paragraphs.
The company and CEA-List started by creating a database with a hierarchy of business concepts associated with key words and expressions. All of the documents written in natural language were indexed in the AMOSE document management system, and then analyzed using LIMA natural language processing software. Lastly, the requirements engineering tools in the MAAT suite were used to analyze sentences indicating contractual obligations and classify them by topic, discipline, and corporate department so that the relevant team could focus on the part of the bidding documents that concerned them specifically.
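The general idea of this pipeline can be illustrated with a minimal Python sketch. It is not the LIMA or MAAT implementation, whose interfaces are not described here: the concept hierarchy, the keyword lists, the obligation cues, and every function name below are hypothetical, and simple keyword matching stands in for real linguistic analysis.

```python
import re
from collections import defaultdict

# Hypothetical concept hierarchy: department -> topic -> trigger keywords.
# The real database was built jointly by the company and CEA-List.
CONCEPTS = {
    "legal": {
        "penalties": ["penalty", "liquidated damages"],
        "liability": ["liability", "indemnify"],
    },
    "civil engineering": {
        "foundations": ["foundation", "excavation"],
    },
    "quality": {
        "inspection": ["inspection", "audit"],
    },
}

# Assumed cues for contractual obligations; a real system would rely on
# linguistic analysis rather than a fixed pattern.
OBLIGATION = re.compile(
    r"\b(shall|must|is required to|are required to|undertakes to)\b",
    re.IGNORECASE,
)


def split_sentences(text):
    """Naive sentence splitter: break after ., ! or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def classify(sentence):
    """Return the (department, topic) pairs whose keywords occur in the sentence."""
    lowered = sentence.lower()
    hits = []
    for department, topics in CONCEPTS.items():
        for topic, keywords in topics.items():
            if any(keyword in lowered for keyword in keywords):
                hits.append((department, topic))
    return hits


def route_obligations(text):
    """Keep only obligation sentences and group them by department."""
    routed = defaultdict(list)
    for sentence in split_sentences(text):
        if not OBLIGATION.search(sentence):
            continue  # descriptive sentence, no obligation cue
        for department, topic in classify(sentence):
            routed[department].append((topic, sentence))
    return dict(routed)


SAMPLE = (
    "The site is located near the river. "
    "The contractor shall complete excavation before the rainy season. "
    "The contractor must pay a penalty for each day of delay. "
    "An audit was carried out last year."
)

if __name__ == "__main__":
    for department, items in route_obligations(SAMPLE).items():
        for topic, sentence in items:
            print(f"{department} / {topic}: {sentence}")
```

In this sketch, the sentence about the audit is not routed to the quality department because it contains no obligation cue, which mirrors the aim of pointing each team only at paragraphs that bind the company.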
CEA-List’s first proof-of-concept prototype was tested successfully on two real government contract bids the partner company was working on. The functionalities developed have already been integrated into the CEA’s INCA smart collaborative engineering environment and are available for use by CEA staff.
CEA-List scientists and engineers are now pushing back the limits of natural language processing even further by using the automatic detection of similar requirements for more thorough semantic analysis.
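How similar requirements are detected is not described here. As a purely illustrative sketch, one simple baseline is to compare the word sets of two requirements with the Jaccard index and flag pairs above a threshold. The function names and the threshold value below are assumptions, not part of the CEA-List tools.

```python
import re
from itertools import combinations


def tokens(sentence):
    """Lowercased set of alphabetic words in the sentence."""
    return set(re.findall(r"[a-z]+", sentence.lower()))


def jaccard(a, b):
    """Jaccard index of the two sentences' word sets (0.0 if both are empty)."""
    ta, tb = tokens(a), tokens(b)
    if not ta and not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def similar_pairs(requirements, threshold=0.5):
    """Return index pairs of requirements whose similarity reaches the threshold."""
    return [
        (i, j)
        for (i, a), (j, b) in combinations(enumerate(requirements), 2)
        if jaccard(a, b) >= threshold
    ]


REQUIREMENTS = [
    "The contractor shall deliver the report monthly.",
    "The contractor shall deliver the report weekly.",
    "All welds must be inspected.",
]

if __name__ == "__main__":
    print(similar_pairs(REQUIREMENTS))
```

Word overlap only captures surface similarity; requirements that say the same thing in different words would need the kind of deeper semantic analysis the article refers to.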