{"id":389,"date":"2018-11-25T11:40:45","date_gmt":"2018-11-24T23:40:45","guid":{"rendered":"https:\/\/isdb.cms.waikato.ac.nz\/?page_id=389"},"modified":"2018-11-25T11:53:31","modified_gmt":"2018-11-24T23:53:31","slug":"capisco-project","status":"publish","type":"page","link":"https:\/\/isdb.cms.waikato.ac.nz\/research-projects\/capisco-project\/","title":{"rendered":"Capisco project"},"content":{"rendered":"

This project developed a tool that helps scholars identify and select resources within a digital library corpus. Currently, such a corpus is accessed through text-based search over its full text and metadata.<\/p>\n

Our Capisco system analyzes documents by the semantics of their content. Traditional access to digitized document collections relies primarily on string-based search in the documents\u2019 full text and metadata. Such text-based search identifies documents purely by lexicographical analysis. Research questions and areas of scholarly interest, however, can rarely be described by simple textual keywords; instead, they encompass larger concepts.<\/p>\n

Capisco avoids the need for complete semantic document markup using ontologies by leveraging an automatically generated Concept-in-Context (CiC<\/em>) network. The network is seeded by a priori analysis of Wikipedia texts and identification of semantic metadata, implementing an annotate-to-Wikipedia (A2W) approach. Capisco disambiguates terms in the documents by their semantics and context and identifies the relevant\u00a0CiC<\/em>\u00a0concepts.<\/p>\n

We further developed means to harness the results of our semantic analysis and disambiguation while retaining the existing keyword-based search and lexicographic index. We engineered this so that the output of the semantic analysis (performed offline) can be imported directly into existing digital library metadata and index structures, and thus incorporated without architectural modifications.<\/p>\n

Capisco was developed in collaboration with the HathiTrust Research Center.<\/p>\n

Project Contact<\/h3>\n

Annika Hinze (hinze@waikato.ac.nz)<\/p>\n

 <\/p>\n

Relevant Academic Publications<\/h3>\n

Hinze, A., Bainbridge, D., Wilkins, R., Taube-Schock, C., & Downie, J. S. (2018). Seeding strategies for semantic disambiguation. In Proc 18th ACM\/IEEE Joint Conference on Digital Libraries (JCDL 2018)<\/i> (pp. 343-344). Fort Worth, Texas: ACM. doi:10.1145\/3197026.3203874<\/a><\/p>\n

Hinze, A., Bainbridge, D., Cunningham, S. J., Taube-Schock, C., Matamua, R., Downie, J. S., & Rasmussen, E. (2018). Capisco: low-cost concept-based access to digital libraries. International Journal on Digital Libraries<\/i>, Online First<\/i>, 1-28. doi:10.1007\/s00799-018-0232-3<\/a><\/p>\n

Hinze, A., Coleman, M., Cunningham, S. J., & Bainbridge, D. (2016). Semantic Bookworm: mining literary resources revisited. In Proc 16th ACM\/IEEE-CS on Joint Conference on Digital Libraries<\/i> (pp. 227-228). Newark, NJ, USA: ACM. doi:10.1145\/2910896.2925444<\/a><\/p>\n

Hinze, A., Bainbridge, D., Cunningham, S. J., & Downie, J. S. (2016). Low-cost semantic enhancement to digital library metadata and indexing: simple yet effective strategies. In Proc 16th ACM\/IEEE-CS on Joint Conference on Digital Libraries<\/i> (pp. 93-102). Newark, NJ, USA: ACM. doi:10.1145\/2910896.2910910<\/a><\/p>\n

Hinze, A., Taube-Schock, C., Bainbridge, D., Cunningham, S. J., & Downie, J. S. (2015). Introducing Capisco: a semantically-enhanced search and discovery system for large-scale text corpora. ACM SIGWEB Newsletter<\/i>, (Autumn 2015), 1-14. doi:10.1145\/2833219.2833223<\/a><\/p>\n

Cunningham, S. J., Hinze, A. M., Bainbridge, D., Taube-Schock, C., & Ryan, T. (2014). Building heritage document collections for Pacific Island nations using semantic-enriched search. In Proceedings of the Samoa Conference III<\/i>. S\u0101moa: National University of S\u0101moa. Retrieved from http:\/\/samoanstudies.ws\/publications\/proceedings-of-the-samoa-conference-iii\/<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"

This project developed a tool that helps scholars identify and select resources within a digital library corpus. Currently, such a corpus is accessed through text-based search over its full text and metadata. Our Capisco system analyzes documents by the semantics of their content. Traditional access to the digitized document collections is available primarily via […]<\/p>\n","protected":false},"author":4,"featured_media":0,"parent":151,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"","meta":{"footnotes":""},"_links":{"self":[{"href":"https:\/\/isdb.cms.waikato.ac.nz\/wp-json\/wp\/v2\/pages\/389"}],"collection":[{"href":"https:\/\/isdb.cms.waikato.ac.nz\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/isdb.cms.waikato.ac.nz\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/isdb.cms.waikato.ac.nz\/wp-json\/wp\/v2\/users\/4"}],"replies":[{"embeddable":true,"href":"https:\/\/isdb.cms.waikato.ac.nz\/wp-json\/wp\/v2\/comments?post=389"}],"version-history":[{"count":4,"href":"https:\/\/isdb.cms.waikato.ac.nz\/wp-json\/wp\/v2\/pages\/389\/revisions"}],"predecessor-version":[{"id":394,"href":"https:\/\/isdb.cms.waikato.ac.nz\/wp-json\/wp\/v2\/pages\/389\/revisions\/394"}],"up":[{"embeddable":true,"href":"https:\/\/isdb.cms.waikato.ac.nz\/wp-json\/wp\/v2\/pages\/151"}],"wp:attachment":[{"href":"https:\/\/isdb.cms.waikato.ac.nz\/wp-json\/wp\/v2\/media?parent=389"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}