Kenneth Heafield Phones & Addresses

  • Asheville, NC
  • New York, NY
  • Menlo Park, CA
  • 3938 Nantasket St, Pittsburgh, PA 15207
  • Cambridge, MA
  • Los Angeles, CA
  • Bloomfield, MI

Publications

US Patents

Identification Of Topics In Source Code

US Patent:
8209665, Jun 26, 2012
Filed:
Sep 17, 2008
Appl. No.:
12/212534
Inventors:
Girish Maskeri Rama - Bangalore, IN
Kenneth Heafield - Pittsburgh PA, US
Santonu Sarkar - Bangalore, IN
Assignee:
Infosys Limited - Bangalore
International Classification:
G06F 9/44
US Classification:
717122, 717107, 717108, 717115, 717116, 717121
Abstract:
Topics in source code can be identified using Latent Dirichlet Allocation (LDA) by receiving source code, identifying domain specific keywords from the source code, generating a keyword matrix, processing the keyword matrix and the source code using LDA, and outputting a list of topics. The list of topics is output as collections of domain specific keywords. Probabilities of domain specific keywords belonging to their respective topics can also be output. The keyword matrix comprises weighted sums of occurrences of domain specific keywords in the source code.
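
The keyword-matrix step described in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the patented method: the location weights, the `#` comment convention, the identifier-splitting rule, and the names `LOCATION_WEIGHTS`, `split_identifier`, and `keyword_matrix` are all hypothetical. The abstract says only that the matrix holds weighted sums of keyword occurrences. The LDA step itself is not shown; the resulting matrix would be handed to an LDA implementation.

```python
import re
from collections import Counter

# Hypothetical weights: the abstract only says occurrences are weighted,
# not how. Here a keyword in an identifier counts double a keyword in a comment.
LOCATION_WEIGHTS = {"identifier": 2.0, "comment": 1.0}


def split_identifier(name):
    """Split camelCase / snake_case identifiers into lowercase words."""
    parts = re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+", name)
    return [p.lower() for p in parts]


def keyword_matrix(files, keywords):
    """Return {file_name: [weighted occurrence sum for each keyword]}."""
    matrix = {}
    for name, source in files.items():
        scores = Counter()
        for line in source.splitlines():
            # Assumption: '#' starts a comment (no string-literal handling).
            code, _, comment = line.partition("#")
            for ident in re.findall(r"[A-Za-z_][A-Za-z0-9_]*", code):
                for word in split_identifier(ident):
                    scores[word] += LOCATION_WEIGHTS["identifier"]
            for word in re.findall(r"[A-Za-z]+", comment):
                scores[word.lower()] += LOCATION_WEIGHTS["comment"]
        # One row per file, one column per domain-specific keyword.
        matrix[name] = [scores[k] for k in keywords]
    return matrix


files = {
    "loan.py": (
        "def computeInterest(loan_amount):  # interest on the loan\n"
        "    return loan_amount * RATE\n"
    )
}
print(keyword_matrix(files, ["loan", "interest", "rate"]))
```

Each row of the result corresponds to one source file and each column to one keyword, which is the shape a topic model expects as input.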

Systems And Methods For Identifying Similar Documents

US Patent:
7958136, Jun 7, 2011
Filed:
Mar 18, 2008
Appl. No.:
12/050626
Inventors:
Taylor Curtis - Santa Monica CA, US
Kenneth Heafield - Cambridge MA, US
Assignee:
Google Inc. - Mountain View CA
International Classification:
G06F 17/30
US Classification:
707758, 707705, 707802
Abstract:
The present invention provides systems and methods for identifying similar documents. In an embodiment, the present invention identifies similar documents by (1) receiving document text for a current document that includes at least one word; (2) calculating a prominence score and a descriptiveness score for each word and each pair of consecutive words; (3) calculating a comparison metric for the current document; (4) finding at least one potential document, where document text for each potential document includes at least one of the words; and (5) analyzing each potential document to identify at least one similar document.
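
The five steps in the abstract can be sketched as below. The abstract does not define the prominence score, the descriptiveness score, or the comparison metric, so this sketch substitutes stand-ins and labels them as assumptions: relative term frequency for prominence, a smoothed inverse document frequency for descriptiveness, and cosine similarity over the product of the two as the comparison. All function names and the `threshold` parameter are hypothetical.

```python
import math
import re
from collections import Counter


def words(text):
    """Lowercased word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def terms(text):
    """Each word plus each pair of consecutive words."""
    w = words(text)
    return w + [f"{a} {b}" for a, b in zip(w, w[1:])]


def term_scores(text, corpus_term_sets):
    """Map each term to prominence * descriptiveness (both are stand-ins)."""
    counts = Counter(terms(text))
    total = sum(counts.values())
    n_docs = len(corpus_term_sets)
    scores = {}
    for term, n in counts.items():
        prominence = n / total  # assumption: relative frequency
        containing = sum(1 for s in corpus_term_sets if term in s)
        # Assumption: smoothed inverse document frequency.
        descriptiveness = math.log((1 + n_docs) / (1 + containing)) + 1.0
        scores[term] = prominence * descriptiveness
    return scores


def cosine(a, b):
    """Cosine similarity between two sparse score vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def similar_documents(current, corpus, threshold=0.1):
    """Return (document, similarity) pairs, most similar first."""
    term_sets = [set(terms(doc)) for doc in corpus]
    current_scores = term_scores(current, term_sets)
    current_words = set(words(current))
    results = []
    for doc in corpus:
        # Step 4: a potential document must share at least one word.
        if not current_words & set(words(doc)):
            continue
        # Step 5: analyze each potential document.
        sim = cosine(current_scores, term_scores(doc, term_sets))
        if sim >= threshold:
            results.append((doc, sim))
    return sorted(results, key=lambda pair: pair[1], reverse=True)


corpus = [
    "the cat sat on the mat",
    "stock prices fell sharply",
    "a cat sat quietly",
]
for doc, sim in similar_documents("the cat sat on the mat", corpus):
    print(f"{sim:.3f}  {doc}")
```

Scoring word pairs alongside single words lets a shared phrase such as "cat sat" count for more than the same two words appearing far apart.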
Kenneth J Heafield from Asheville, NC, age ~38