Building Knowledge Bases with Universal Schema: Cold Start and Slot-Filling Approaches

Benjamin Roth, Nicholas Monath, David Belanger, Emma Strubell, Patrick Verga, Andrew McCallum
Department of Computer Science
University of Massachusetts Amherst
Amherst, MA, 01003, USA
beroth@cs.umass.edu

Abstract

We compare the performance of two different relation prediction architectures based on the same relation predictors. The knowledge base construction architecture builds a complete knowledge base for the entire corpus, and commits to entity linking and clustering decisions ahead of time. The query-driven slot filling architecture can make entity expansion and retrieval decisions on the fly, and has the flexibility to trade precision for recall. We use a wide range of established and novel techniques for our relation extraction components. They include distant supervision-based classifiers (SVM and convolutional neural nets), rule-based extractors, and semi-supervised matrix embedding methods taking into account all co-occurrences of surface patterns and entities in the corpus (universal schema).

1 Overview

UMass IESL participated in both Cold Start tasks: KB construction and Slot Filling. While the relation prediction relies on the same models for both tasks, we have developed different, task-dependent system architectures for each setting.

The KB construction task requires a complete KB to be built ahead of time. This includes clustering the entire set of entity mentions into disambiguated KB entities, and connecting the entities by predicted relations. For the Slot-Filling (SF) Cold Start setting, the knowledge base has to be constructed only partially at query time, starting from the specified query entities.

The SF setting is less rigid than the KB setting. Since the entity mentions are not pre-clustered, entity expansion techniques are query centered and leave more room for controlling precision and recall.
Since it has been shown that current Slot-Filling systems mainly suffer from low recall, this may be a desirable property. On the other hand, having a complete, query-independent knowledge base (as in the KB construction setting) may open new avenues for joint reasoning, knowledge discovery and filtering.

Which of the settings is more appropriate in a real-world scenario will depend on the particular circumstances. It is, however, interesting to understand what the exact tradeoffs are between the two settings. Having access to two different high-level architectures that use the same re-