US Patent:
20210224584, Jul 22, 2021
Inventors:
- Dublin, IE
Yao A. Yang - San Francisco CA, US
Saeideh Shahrokh Esfahani - Mountain View CA, US
Andrew E. Fano - Lincolnshire IL, US
David William Vinson - San Francisco CA, US
Timothy M. Shea - Merced CA, US
International Classification:
G06K 9/62
G06N 5/00
G06N 20/10
Abstract:
Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for clustering data are disclosed. In one aspect, a method includes the actions of receiving feature vectors. The actions further include accessing rules that each relate one or more values of the feature vectors to a respective label of a plurality of labels. The actions further include, based on the rules, generating heuristics that each identify related values of the feature vectors. The actions further include, for each of the heuristics, generating a matrix that reflects a similarity of the feature vectors. The actions further include, based on the matrices that each reflect a respective similarity of the feature vectors, generating clusters that each include a subset of the feature vectors. The actions further include, for each cluster, determining a label of the plurality of labels.
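
The sequence of actions in the abstract can be illustrated with a minimal Python sketch. It is not the claimed implementation; every concrete choice in it is an assumption made for illustration. The rule format (feature index, value, label), the sample data, the 0/1 similarity score, the averaging of matrices, the connected-components grouping, and the majority vote for the cluster label are all hypothetical, since the abstract does not specify any of them.

```python
from collections import Counter, defaultdict

# Hypothetical rules: (feature_index, value, label). Format is an assumption.
RULES = [
    (0, "refund", "billing"),
    (0, "invoice", "billing"),
    (0, "crash", "technical"),
    (1, "charge", "billing"),
    (1, "error", "technical"),
    (1, "timeout", "technical"),
]

# Hypothetical feature vectors; "other" is matched by no rule.
VECTORS = [
    ("refund", "charge"),
    ("invoice", "charge"),
    ("crash", "error"),
    ("crash", "timeout"),
    ("invoice", "other"),
]


def build_heuristics(rules):
    """One heuristic per feature: values sharing a rule label count as related."""
    heuristics = defaultdict(dict)
    for index, value, label in rules:
        heuristics[index][value] = label
    return dict(heuristics)


def similarity_matrix(vectors, index, value_to_label):
    """Entry is 1.0 when the heuristic relates the two vectors' values, else 0.0."""
    groups = [value_to_label.get(v[index]) for v in vectors]
    n = len(vectors)
    return [
        [1.0 if groups[a] is not None and groups[a] == groups[b] else 0.0
         for b in range(n)]
        for a in range(n)
    ]


def cluster(matrices, threshold=0.5):
    """Assumed grouping: connected components where mean similarity >= threshold."""
    n = len(matrices[0])
    mean = [[sum(m[a][b] for m in matrices) / len(matrices) for b in range(n)]
            for a in range(n)]
    unseen, clusters = set(range(n)), []
    while unseen:
        start = min(unseen)
        unseen.remove(start)
        stack, component = [start], [start]
        while stack:
            a = stack.pop()
            for b in sorted(unseen):
                if mean[a][b] >= threshold:
                    unseen.remove(b)
                    stack.append(b)
                    component.append(b)
        clusters.append(sorted(component))
    return clusters


def label_cluster(vectors, members, heuristics):
    """Assumed labeling: majority vote over rule labels matched by cluster members."""
    votes = Counter()
    for i in members:
        for index, value_to_label in heuristics.items():
            label = value_to_label.get(vectors[i][index])
            if label is not None:
                votes[label] += 1
    return votes.most_common(1)[0][0] if votes else None


heuristics = build_heuristics(RULES)
matrices = [similarity_matrix(VECTORS, i, h) for i, h in heuristics.items()]
clusters = cluster(matrices)
labels = [label_cluster(VECTORS, members, heuristics) for members in clusters]
for members, label in zip(clusters, labels):
    print(label, members)
```

In this sketch the fifth vector joins the billing group because one of the two heuristics relates it to the first two vectors, which brings its mean similarity up to the assumed threshold of 0.5. A different threshold or a different way of combining the matrices would change the grouping.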