AI learns to read Korean, so you don’t have to

The long arm of American law sparks fear among international companies, even those based thousands of miles away.

So, when a South Korean company received a grand jury subpoena from the US Department of Justice, its lawyers at law firm Hogan Lovells had to react fast.

The South Korean company, which provided services to the US government, was facing serious allegations of price-fixing and bid-rigging behaviour after it became embroiled in a US antitrust investigation.

Obtaining the necessary records was never going to be simple. The lawyers would need to search through millions of potentially relevant documents. With the company facing big potential fines in the US, the stakes were high.

“This was a very large organisation with massive amounts of data and we only had a relatively short time,” says Jeremy Burdge, litigation practice group lead for the Americas at Hogan Lovells. “There were millions and millions of records and it was like trying to find a needle in a haystack to respond to the allegations from the US.”

Stephen Allen, global head of legal operations at the law firm, says this meant that Hogan Lovells in effect had to choose between “hiring every single Korean-speaking attorney in the US” or deploying technology to help narrow the search for key documents.

The challenge of character-based scripts

Artificial intelligence tools can help law firms analyse and sift documents as a lawyer would, but faster and without human errors. These tools can transform legal investigations and acquisition deals by homing in on the most relevant documents, which in turn saves time and money.

But in this case, there was an additional hurdle. Much of the legal AI developed in the US and Europe assumes the data is in English, or at least in a western language, rather than in character-based scripts such as Chinese or Japanese, or in other distinct writing systems such as Korean.

Mr Burdge then recalled reading an academic paper on “tokenisation”, in which a program creates mathematical formulas for certain sets of characters or letter combinations. He realised this could be applied to the antitrust investigation’s documents.

By giving every symbol a value and applying a mathematical formula to filter in or out desired Korean letters or combinations of characters relevant to the investigation, the language processing software could review documents in Korean for relevant words or phrases identified by antitrust lawyers.
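
To make the idea concrete, here is a minimal sketch in Python of scoring documents by weighted character combinations. It is an illustration only, not the method Hogan Lovells or Knovos actually built, which the firms have not published. The function names, the weights, the threshold and the sample documents are all assumptions made for the example.

```python
import unicodedata
from collections import Counter


def char_ngrams(text: str, n: int = 2) -> Counter:
    """Count overlapping character n-grams, ignoring whitespace."""
    # NFC composes Hangul jamo into whole syllable blocks, so each
    # syllable is a single character.
    chars = "".join(unicodedata.normalize("NFC", text).split())
    return Counter(chars[i:i + n] for i in range(len(chars) - n + 1))


def relevance_score(text: str, weights: dict, n: int = 2) -> float:
    """Sum the weight of every n-gram occurrence found in the text."""
    grams = char_ngrams(text, n)
    return sum(weights.get(gram, 0.0) * count for gram, count in grams.items())


def filter_documents(docs: dict, weights: dict, threshold: float) -> list:
    """Return the ids of documents scoring at or above the threshold."""
    return [doc_id for doc_id, text in docs.items()
            if relevance_score(text, weights) >= threshold]


# Hypothetical weights for terms a reviewer might flag:
# 입찰 (bid), 가격 (price), 담합 (collusion).
weights = {"입찰": 1.0, "가격": 1.0, "담합": 2.0}

# Hypothetical documents, invented for this example.
docs = {
    "a": "입찰 가격 담합 회의",  # mentions bids, prices and collusion
    "b": "점심 회의 일정",       # a lunch meeting schedule
}

selected = filter_documents(docs, weights, threshold=2.0)
print(selected)
```

In this toy case only document "a" clears the threshold. A production system would learn its weights from documents that lawyers had already reviewed, rather than having them set by hand.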

Hogan Lovells worked with technology company Knovos, which “threw developers” at the problem and quickly set up a data test centre in Seoul.

“We had [more than] 10m documents, of which 3.86m were relevant, and it selected 85,000 documents, which then needed to be reviewed by attorneys,” recalls Mr Burdge. His team also tried a more conventional approach using key words but this only narrowed the search to 500,000 documents.
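
The figures Mr Burdge quotes can be compared with a few lines of arithmetic. The short Python sketch below uses only the numbers given above; the variable names are illustrative.

```python
relevant = 3_860_000        # documents described as relevant
ai_selected = 85_000        # documents selected by the software for review
keyword_selected = 500_000  # documents left by the keyword search

ai_share = ai_selected / relevant
keyword_share = keyword_selected / relevant
ratio = keyword_selected / ai_selected

print(f"Software review set: {ai_share:.1%} of relevant documents")
print(f"Keyword review set: {keyword_share:.1%} of relevant documents")
print(f"Keyword set is {ratio:.1f} times larger")
```

On these numbers the software left about 2.2 per cent of the relevant documents for lawyers to read, against about 13 per cent for the keyword search, a review set nearly six times larger.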

Hogan Lovells estimates that it saved 8,300 hours of review time by lawyers and $400,000 in costs. The process helped lawyers respond to the DoJ’s allegations quickly.

Data from public records

Law firms are also using software to automate the collection of data from public records, saving lawyers from spending hours manually inputting information.

In the case of Indian law firm Anand and Anand, technology was vital to handle the increased volumes of data generated by the Indian patent office when the state-run organisation started a push to clear its huge backlog of cases.

The increased workload put Indian patent firms under pressure, as any delay in reporting patent notifications could affect clients’ businesses.

Anand and Anand developed a tool to extract information by crawling over a huge amount of patent office data. This generated an email to clients informing them about the latest developments. The tool was more cost-effective than the law firm hiring temporary staff to handle the increased workload.
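
The comparison step at the heart of such a tool can be sketched briefly in Python. This is a hypothetical illustration, not Anand and Anand's code: the record fields, function names and sample data are assumptions, and the crawling itself is left out, with the patent office records treated as already fetched.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class PatentRecord:
    """One row of (already fetched) patent office data. Hypothetical shape."""
    application_no: str
    status: str


def find_updates(firm_records: dict[str, str],
                 office_records: list[PatentRecord]) -> list[tuple[str, str, str]]:
    """Return (application, old status, new status) for each changed case.

    Applications the firm does not track are ignored.
    """
    updates = []
    for record in office_records:
        known = firm_records.get(record.application_no)
        if known is not None and known != record.status:
            updates.append((record.application_no, known, record.status))
    return updates


def draft_email(client: str, updates: list[tuple[str, str, str]]) -> str:
    """Build a plain-text notification for attorney review before sending."""
    lines = [f"Dear {client},", "", "The status of these applications has changed:"]
    lines += [f"- {no}: {old} -> {new}" for no, old, new in updates]
    return "\n".join(lines)


# Hypothetical data, invented for this example.
firm_records = {"A1": "Filed", "A2": "Under examination"}
office_records = [
    PatentRecord("A1", "Examination report issued"),
    PatentRecord("A2", "Under examination"),
    PatentRecord("A3", "Filed"),  # not one of the firm's cases
]

updates = find_updates(firm_records, office_records)
print(draft_email("Client", updates))
```

Only the first application appears in the drafted email: the second has not changed and the third is not in the firm's records.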

Other law firms have developed technology to extract certain pieces of information from public databases. Lawyers working on deals need to obtain data from securities databases to answer questions such as whether a company has an Australian director or property assets other than real estate, says Mark Malinas, co-head of the private equity practice at law firm Allens. This is time-consuming for lawyers working to a tight deadline, who have to repeat the same checks on dozens of companies.

Allens’ legal technology team developed a system called SmartCompile, which extracts relevant company information and feeds it into a cloud-based platform, helping the lawyers put together due diligence reports more quickly.

“This way we can plug the software into the public database and pull out the information,” Mr Malinas says. “It allows our lawyers to do more interesting stuff rather than mundane data entry.”

The tables below rank law firms and in-house legal teams for the FT Innovative Lawyers Apac awards.

Data, Knowledge and Intelligence:

Rank	Law firm	Description	Originality	Leadership	Impact	Total
STANDOUT	Hogan Lovells	Acted for a South Korean conglomerate facing allegations from the US Department of Justice, which required reviewing 26m documents. The 3.86m documents that were searchable were mostly written in Korean script, notoriously difficult for natural language processing software. The firm worked with Knovos to create new, more predictable and accurate algorithms to analyse script language. The firm narrowed its manual review to 85,950 documents, saving crucial time, $400,000 and more than 8,000 hours of work by lawyers. Commended: Jeremy Burdge	7	9	8	24
HIGHLY COMMENDED	Anand and Anand	In response to a doubling of activity by the India Patent Office to clear its backlog of patent applications, the firm, working for a portfolio of clients seeking patent approval, developed an algorithm to crawl the patent office website and compare data with firm records to generate a report for attorney review. The firm estimates that it will save $100,000 in annual costs of paying experienced patent paralegals ordinarily required for this process.	7	8	7	22
HIGHLY COMMENDED	Allens	The firm developed SmartCompile, technology to extract and analyse information from public registries, to include in due diligence reports. The firm estimates that by cutting down on hours, it has already saved A$700,000.	7	7	7	21
HIGHLY COMMENDED	MinterEllison	Leveraging IBM Watson Explorer, an artificial intelligence tool for data analysis, the firm built a tool to perform more efficient reviews of multiple claims at a lower fixed-fee cost for a client. Called MEIKA, the tool is being used by the firm to manage its portfolio of work for the client.	7	7	7	21
COMMENDED	Shearman & Sterling	Introduced a pricing model and matter management index to improve transparency and billing. This helps the firm to determine whether it can profitably resource a project within budget, considering variables such as office overhead and lawyer salaries. Matters can be monitored on a dashboard to track spend in real-time.	6	7	7	20
COMMENDED	Paul Hastings	Performed a global overhaul of legal project management including dashboards, eLearning and practice area-specific information toolkits to bring together best practice and help guide junior lawyers through matters such as initial public offerings.	6	7	6	19
COMMENDED	Verus	The firm launched Riverus, a tool that analyses publicly available judicial data each day and adds it to a database that subscribers can access. Users can compare cases and types of contract to assess tax implications.	6	7	5	18

Data, Knowledge and Intelligence (In-house):

Rank	Company	Description	Originality	Leadership	Impact	Total
STANDOUT	Credit Suisse	The bank's litigation and investigations group developed an ediscovery playbook that standardised protocols for Credit Suisse staff, external counsel and legal process outsourcing companies to follow. It also offered procedures for the use of machine learning and analytics. A single dashboard enables all parties to monitor the progress of review projects in real time and make adjustments, resulting in estimated cost savings of 30 per cent and a 50 per cent reduction in review time over the past year.	7	8	7	22
HIGHLY COMMENDED	Westpac	The operations team mapped and digitised 97 individual projects handled by the in-house team, and trained operations lawyers to do the same. The goal is to enhance its lawyers' digital capabilities, improve risk identification and ensure data-driven decision making across the department.	7	8	6	21
COMMENDED	Hongkong and Shanghai Hotels	The team takes an active role in data management, working with IT from the beginning of procurement processes to implement new customer facing software and systems. This ensures they are developed with customer experience in mind to better control the flow and use of data.	6	7	6	19

