April 2017, 2(2): 97-106. doi: 10.3934/bdia.2017001

First steps in the investigation of automated text annotation with pictures

York University, Dept. of Electrical Engineering and Computer Science, 4700 Keele Street, Toronto, Ontario, M3J 1P3, Canada

* Corresponding author: Kent Poots

Published  April 2017

We describe an investigation of the automatic annotation of text with pictures, in which knowledge extraction uses dependency parsing. Annotating text with pictures, a form of knowledge visualization, can assist understanding. The problem is stated as follows: given a corpus of images and a short passage of text, extract knowledge (or concepts) from the text, then display that knowledge as pictures alongside the text to aid understanding. The proposed solution framework comprises a component that extracts document concepts, a component that matches document concepts with picture metadata, and a component that produces an amalgamated output of text and pictures. A proof-of-concept application based on the proposed framework provides encouraging results.

Citation: J. Kent Poots, Nick Cercone. First steps in the investigation of automated text annotation with pictures. Big Data & Information Analytics, 2017, 2 (2) : 97-106. doi: 10.3934/bdia.2017001
References:
[1] B. Coyne and R. Sproat, WordsEye: An automatic text-to-scene conversion system, Proceedings of the 28th annual conference on Computer graphics and interactive techniques (2), 3 (2003), 487-496.

[2] D. Genzel, K. Macherey and J. Uszkoreit, Creating a high-quality machine translation system for a low-resource language: Yiddish, (2009), Available from: www.mt-archive.info/MTS-2009-Genzel.pdf

[3] A. Handler, An empirical study of semantic similarity in WordNet and Word2Vec, Columbia University (2014).

[4] D. Joshi, J. Z. Wang and J. Li, The Story Picturing Engine-a system for automatic text illustration, ACM Transactions on Multimedia Computing, Communications, and Applications (TOMCCAP), 2 (2006), 68-89.

[5] J. McCarthy, Programs with common sense, Defense Technical Information Center (1963).

[6] J. K. Poots and E. Bagheri, Automatic annotation of text with pictures, (in-press), IEEE IT Professional (2016).

[7] S. Rose, D. Engel, N. Cramer and W. Cowley, Automatic keyword extraction from individual documents, Text Mining, (2010), 1-20.

[8] V. Uren, P. Cimiano, J. Iria, S. Handschuh, M. Vargas-Vera, E. Motta and F. Ciravegna, Semantic annotation for knowledge management: Requirements and a survey of the state of the art, Web Semantics: science, services and agents on the World Wide Web, 4 (2006), 14-28.

[9] N. UzZaman, J. P. Bigham and J. F. Allen, Multimodal summarization of complex sentences, Proceedings of the 16th international conference on Intelligent user interfaces, 2 (2004), 43-52.

[10] T. Veale, A. Conway and B. Collins, The challenges of cross-modal translation: English-to-Sign-Language translation in the Zardoz system, Machine Translation, 13 (1998), 81-106.

[11] L. Zhao, K. Kipper, W. Schuler, C. Vogler, N. Badle and M. Palmer, A machine translation system from English to American Sign Language, Envisioning Machine Translation in the Information Future, (2000), 54-67.

Figure 1. Processing for Text to Picture System
Figure 2. The Hierarchy of Meaning
Figure 4. Proposed Annotation With Pictures Framework
Figure 3. Proposed Annotation With Pictures Framework
Figure 5. Example of Text Automatically Annotated With Pictures From UzZaman et al. [10]
Table 1. Core Components for Text Picturing Implementation
1 | Knowledge Representation | Subject / Verb / Object using Stanford form dependencies
2 | Knowledge Extraction     | RAKE baseline, Stanford, TextRazor, Deppattern parsers
3 | Test Sentences           | 4 short sentences including French translation
4 | Image Database           | Google-Image pictures of nouns, verbs in Sign Language
5 | Text/Image Matching      | Binary match
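
Rows 1 and 5 of Table 1 can be illustrated with a minimal sketch, assuming the parser output is already available as Stanford-style (relation, head, dependent) edges. The function names, the sample sentence, the hand-written parse, and the image tags below are all hypothetical and are not taken from the paper; the sketch only shows how a Subject / Verb / Object triple might be read off `nsubj` and `dobj` relations and then matched exactly ("binary match") against picture metadata.

```python
from typing import Dict, List, NamedTuple, Optional, Tuple

# A dependency edge in Stanford-style form: (relation, head word, dependent word).
Dependency = Tuple[str, str, str]


class SVO(NamedTuple):
    subject: Optional[str]
    verb: Optional[str]
    obj: Optional[str]


def extract_svo(dependencies: List[Dependency]) -> SVO:
    """Read one Subject / Verb / Object triple off a list of dependency edges."""
    subject = verb = obj = None
    for relation, head, dependent in dependencies:
        if relation == "nsubj":
            # nsubj(verb, subject): the head is the verb, the dependent its subject.
            subject, verb = dependent, head
        elif relation == "dobj" and (verb is None or head == verb):
            # dobj(verb, object): accept it only if it attaches to the same verb.
            verb, obj = head, dependent
    return SVO(subject, verb, obj)


def binary_match(svo: SVO, image_tags: Dict[str, str]) -> Dict[str, Optional[str]]:
    """Binary match: a term either equals an image tag exactly or gets no picture."""
    return {
        term: image_tags.get(term.lower())
        for term in svo
        if term is not None
    }


# Hypothetical, hand-written parse of "The dog chases the ball."
parse = [
    ("det", "dog", "The"),
    ("nsubj", "chases", "dog"),
    ("det", "ball", "the"),
    ("dobj", "chases", "ball"),
]
# Hypothetical image metadata: tag -> picture file.
images = {"dog": "dog.jpg", "ball": "ball.jpg"}

svo = extract_svo(parse)
annotation = binary_match(svo, images)
print(svo)
print(annotation)
```

In this sketch a term with no exactly matching tag maps to `None`, which is one simple way to surface gaps in the image database; a real system would need a policy for such misses.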
Table 2. Core Components - Test Results Summary
1 | Extract baseline terms using RAKE   | RAKE extracted meaningful terms
2 | Constituency and dependency parse   | Stanford and TextRazor gave same result; Depparse results varied
3 | Compare input text to SVO           | SVO was adequate for basic sentences
4 | Create rendered scene               | Scene matched SVO
5 | Knowledge Extract, Rendering Gaps?  | SVO sometimes needs additional terms; consider verb valency.
6 | Compare to prior work               | Renderings provided equivalent detail.
7 | Evaluate output pictures alone      | Pictures could help understanding; pictures are not a replacement.
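
The RAKE baseline in row 1 of Table 2 can be sketched as follows. This is a minimal illustration of the general RAKE idea from reference [7] (split text into candidate phrases at stopwords and punctuation, score each word by degree divided by frequency, and score a phrase as the sum of its word scores); it is not the implementation used in the paper. The stopword list and the sample sentence are assumptions chosen for brevity.

```python
import re
from collections import defaultdict
from typing import Dict, List, Set

# Tiny illustrative stopword list (an assumption; RAKE is normally run with a fuller list).
STOPWORDS: Set[str] = {"a", "an", "the", "of", "in", "on", "with", "and", "to", "is", "are"}


def candidate_phrases(text: str, stopwords: Set[str]) -> List[List[str]]:
    """Split text into runs of content words, breaking at stopwords and punctuation."""
    phrases: List[List[str]] = []
    for fragment in re.split(r"[^\w\s]+", text.lower()):
        current: List[str] = []
        for word in fragment.split():
            if word in stopwords:
                if current:
                    phrases.append(current)
                current = []
            else:
                current.append(word)
        if current:
            phrases.append(current)
    return phrases


def rake_scores(text: str, stopwords: Set[str] = STOPWORDS) -> Dict[str, float]:
    """Score each candidate phrase as the sum of degree(word) / frequency(word)."""
    phrases = candidate_phrases(text, stopwords)
    frequency: Dict[str, int] = defaultdict(int)
    degree: Dict[str, int] = defaultdict(int)
    for phrase in phrases:
        for word in phrase:
            frequency[word] += 1
            # Degree counts co-occurrences within the phrase, including the word itself.
            degree[word] += len(phrase)
    return {
        " ".join(phrase): sum(degree[w] / frequency[w] for w in phrase)
        for phrase in phrases
    }


# Hypothetical test sentence, not one of the paper's four.
scores = rake_scores("The boy kicks the ball in the park.")
for phrase, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(f"{score:.1f}  {phrase}")
```

Because RAKE favors longer runs of content words, the two-word phrase outranks the single nouns here, which is the kind of behavior a keyword baseline offers for comparison with parser-derived SVO terms.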
Table 3. Research Objectives vs. Actual Results
No. | Objective               | Actual Result
1   | Evaluate feasibility    | Feasibility was demonstrated
2   | Identify CL topic areas | Topics include IR, KE, parsing, matching, cognitive science (perception)
3   | Propose a framework     | Proposed, demonstrated dependency parsing, binary match
4   | Provide Test Results    | Demonstrated SVO model for short sentences; may need to consider term valence.