Text Analytics
1. What role does text retrieval have in text analytics and mining?
2. Which type of word tells us the most about a document, high or low entropy H(x) words?
3. Name two limitations of natural language processing (NLP).
5. Why is statistical analysis used instead of NLP?
Solution
1. What role does text retrieval have in text analytics and mining?
Answer : Text retrieval typically deals with crawling, parsing, and indexing documents. Mining is the process of extracting meaningful information from a large collection of otherwise unorganized data. Retrieval is concerned with the most effective ways of finding the information that matches the user's needs, so it narrows a large collection down to the relevant documents on which mining can then be carried out.
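To make the indexing step concrete, here is a minimal sketch of an inverted index, one common structure used for retrieval. The function names `build_inverted_index` and `search`, the whitespace tokenizer, and the sample documents are all assumptions made for illustration; a real system would also handle crawling, stemming, and ranking.

```python
from collections import defaultdict


def build_inverted_index(documents):
    """Map each lowercase token to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in enumerate(documents):
        # Naive parsing: lowercase and split on whitespace (an assumption).
        for token in text.lower().split():
            index[token].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents that contain every term in the query."""
    terms = query.lower().split()
    if not terms:
        return set()
    # Copy the first posting set so the index itself is never modified.
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


# Hypothetical sample collection.
documents = [
    "text retrieval finds relevant documents",
    "text mining extracts patterns",
    "retrieval supports mining",
]
index = build_inverted_index(documents)
print(search(index, "text retrieval"))
```

The documents returned by `search` would then be the input to a mining step, which is the sense in which retrieval comes first.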
2. Which type of word tells us the most about a document, high or low entropy H(x) words?
Answer : Low-entropy words tell us the most about a document; high-entropy words tell us less.
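One way to see this is to compute the entropy of a word's distribution across a collection. The sketch below assumes that H(x) is measured over how a word's occurrences are spread across documents, which is only one possible reading of the question; the function name `word_entropy` and the sample documents are hypothetical.

```python
import math


def word_entropy(word, documents):
    """Shannon entropy (in bits) of a word's occurrence distribution over documents."""
    counts = [doc.lower().split().count(word) for doc in documents]
    total = sum(counts)
    if total == 0:
        return 0.0  # The word never occurs, so there is no distribution to measure.
    entropy = 0.0
    for count in counts:
        if count > 0:
            p = count / total
            entropy -= p * math.log2(p)
    return entropy


# Hypothetical sample collection.
docs = [
    "the cat sat on the mat",
    "the dog chased the ball",
    "the entropy of the signal",
    "the bird sang in the tree",
]
print(word_entropy("the", docs))      # spread evenly over all four documents
print(word_entropy("entropy", docs))  # concentrated in a single document
```

Under this assumption, a word spread evenly over every document gets the maximum entropy and does little to distinguish one document from another, while a word concentrated in one document gets an entropy of zero and points directly at it.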
3. Name two limitations of natural language processing (NLP).
Answer :
The following are two limitations of NLP:
a.) One of the biggest and most noticeable limitations is machine translation. We cannot fully trust Google Translate, because most of the time its output has to be modified to get a good translation. Language modeling and alignment remain challenging problems for researchers trying to improve machine translation (MT).
b.) The main limitation of "modern NLP technologies" is their dependency on huge computing power. Artificial neural networks are far from matching the efficiency of the brain when it comes to processing terabytes of data. As a consequence, deep-learning-based NLP tools are reduced to analysing samples of the Big Text Data, which, in the case of email surveillance for example, is just not enough.
5. Why is statistical analysis used instead of NLP?
Answer :
The goal of statistical analysis is to identify trends; it deals with the collection, organization, analysis, interpretation, and presentation of data. For example, retail businesses use statistical analysis, instead of NLP, to find patterns in unstructured and semi-structured customer data that can be used to create a more positive customer experience and increase sales.
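As a rough illustration of finding patterns in customer text without any linguistic parsing, the sketch below simply counts how often each term occurs. The function name `top_terms`, the stopword list, and the sample comments are all hypothetical.

```python
from collections import Counter


def top_terms(comments, stopwords, k=3):
    """Return the k most frequent non-stopword terms with their counts."""
    counts = Counter()
    for comment in comments:
        for token in comment.lower().split():
            token = token.strip(".,!?")  # Drop surrounding punctuation.
            if token and token not in stopwords:
                counts[token] += 1
    return counts.most_common(k)


# Hypothetical customer feedback.
comments = [
    "Delivery was late again.",
    "Late delivery, but great price!",
    "Great price and friendly staff.",
]
stopwords = {"was", "but", "and", "again"}
print(top_terms(comments, stopwords, k=4))
```

Counting alone surfaces recurring themes in the feedback, which is the kind of trend a retailer could act on without a full NLP pipeline.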
