Natural Language Processing (NLP) is a field of artificial intelligence that focuses on the interaction between computers and humans through natural language. In simple terms, NLP helps computers understand, interpret, and respond to human language in a way that is both meaningful and useful. NLP is widely used in applications like chatbots, voice assistants, translation tools, and search engines.
Key Tasks in NLP:¶
Here are some of the main tasks that NLP helps with:
Text Classification: Categorizing text into different categories or labels. For example, classifying an email as spam or not spam.
Tokenization: Breaking down a piece of text into smaller parts, like words or sentences. For example, breaking the sentence “I love NLP” into [“I”, “love”, “NLP”].
Part-of-Speech (POS) Tagging: Identifying the part of speech (noun, verb, adjective, etc.) for each word in a sentence. For example, in “She is running,” “She” would be tagged as a pronoun, “is” as an auxiliary verb, and “running” as a verb (present participle).
Named Entity Recognition (NER): Identifying and classifying key elements in a text, such as names of people, places, organizations, dates, etc. For example, in “Barack Obama was born in Hawaii,” the system recognizes “Barack Obama” as a person and “Hawaii” as a location.
Sentiment Analysis: Understanding the emotion or sentiment behind a piece of text. For example, analyzing product reviews to determine if they are positive, negative, or neutral.
Machine Translation: Automatically translating text from one language to another, like translating English to Hindi.
Text Summarization: Creating a short summary of a longer piece of text. For example, summarizing a news article into a few sentences.
Speech Recognition: Converting spoken language into written text, like when voice assistants (Siri, Google Assistant) understand your commands.
How NLP Works:¶
NLP uses a combination of linguistics and machine learning techniques to analyze text. Here’s how it typically works:
Preprocessing:
- Text is first cleaned up by removing unnecessary elements (like punctuation or special characters).
- It is then split into tokens (words or phrases) for easier analysis.
Feature Extraction:
- The important aspects of the text (like words, phrases, or grammar structures) are identified.
- In some cases, the text is converted into numerical data (like word embeddings) so that the computer can process it.
Modeling:
- Machine learning models (like neural networks, decision trees, or transformers) are trained on large amounts of text data to learn patterns in language.
Understanding and Generating Text:
- After training, the model can make predictions, like classifying text, answering questions, or even generating new text based on the input.
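As a toy end-to-end illustration of these steps, the sketch below trains a Naive Bayes spam classifier with NLTK. The texts and labels are invented for illustration; a real system would need far more data and richer features.

```python
from nltk.classify import NaiveBayesClassifier

# Toy labelled data (invented for illustration).
train = [
    ("win money now", "spam"),
    ("free prize claim", "spam"),
    ("meeting at noon", "ham"),
    ("lunch with team", "ham"),
]

def features(text):
    # Feature extraction: bag-of-words presence features.
    return {word: True for word in text.lower().split()}

# Preprocessing + feature extraction, then modeling.
train_set = [(features(text), label) for text, label in train]
classifier = NaiveBayesClassifier.train(train_set)

# Understanding: the trained model classifies unseen text.
print(classifier.classify(features("claim your free money")))
```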
Example Applications of NLP:¶
- Chatbots and Virtual Assistants: Systems like Siri, Alexa, and Google Assistant use NLP to understand and respond to voice commands.
- Language Translation: Tools like Google Translate use NLP to automatically translate text between languages.
- Sentiment Analysis: Businesses use sentiment analysis to understand customer opinions from reviews or social media posts.
- Text Prediction: When you type a message, NLP-powered keyboards like Gboard or SwiftKey predict the next word.
Challenges in NLP:¶
- Ambiguity: Human language is often ambiguous. Words or phrases can have multiple meanings based on context (e.g., “bat” could mean a flying animal or sports equipment).
- Sarcasm and Humor: It’s difficult for machines to detect sarcasm or humor because it often relies on subtle cues and context.
- Context Understanding: While NLP models can understand individual words or sentences, understanding context across multiple sentences or paragraphs is more challenging.
Summary:¶
Natural Language Processing (NLP) is the technology behind making machines understand and interact with human language. It powers applications like chatbots, translation services, and text analysis by using techniques from both linguistics and machine learning to process and generate human language in a useful way.
Let’s Review with Some Explanations¶
Tokenizer ¶
- A Tokenizer is a tool used in Natural Language Processing (NLP) to break down text into smaller units called “tokens,” such as words, phrases, or sentences. Tokenization helps in processing and analyzing text by dividing it into manageable parts. For example, the sentence “I love NLP” would be tokenized into [“I”, “love”, “NLP”]. Tokenization is a crucial first step in many NLP tasks like text analysis and machine learning.
Stop words ¶
- Stop words are common words in a language, such as “the,” “is,” “in,” and “at,” which are often removed from text during Natural Language Processing (NLP) tasks. These words don’t carry significant meaning and are filtered out to focus on more important words that help in understanding the text. Removing stop words helps in reducing the size of the data and improving the efficiency of tasks like text classification or information retrieval.
Synsets ¶
- Synsets (short for synonym sets) are a group of words or phrases in a language that share the same or nearly the same meaning. In Natural Language Processing (NLP), synsets are used to represent concepts and relationships between words. For example, the words “happy,” “joyful,” and “content” would belong to the same synset because they convey similar meanings. Synsets are commonly used in lexical databases like WordNet to help computers understand the different meanings and relationships between words.
Part of Speech ¶
- Part of Speech (POS) refers to the grammatical category of a word in a sentence based on its role, such as noun, verb, adjective, adverb, etc. In Natural Language Processing (NLP), POS tagging is the process of labeling each word in a text with its corresponding part of speech. For example, in the sentence “She is running,” “She” is tagged as a pronoun, “is” as an auxiliary verb, and “running” as a verb (present participle). POS tagging helps in understanding the structure and meaning of sentences.
Lemmas, Synonyms and Antonyms ¶
Here’s a simple explanation of lemmas, synonyms, and antonyms:
Lemmas:
A lemma is the base or dictionary form of a word. For example, the lemma of “running” is “run,” and the lemma of “better” is “good.” In Natural Language Processing (NLP), lemmatization is the process of reducing a word to its base form to simplify analysis.
Synonyms:
Synonyms are words that have the same or nearly the same meaning. For example, “happy” and “joyful” are synonyms because they express similar emotions. Synonyms are useful for expanding vocabulary and improving understanding.
Antonyms:
Antonyms are words with opposite meanings. For example, “hot” is the antonym of “cold,” and “big” is the antonym of “small.” Identifying antonyms is helpful in understanding contrast and context in language.
Stemming ¶
- Stemming is a process in Natural Language Processing (NLP) that reduces words to their base or root form by removing suffixes and prefixes. The goal is to simplify words to a common stem, even if the stem is not a real word. For example, “running” and “runs” would both be reduced to “run” through stemming, while an irregular form like “ran” would be left unchanged, since stemming only strips affixes.
Unlike lemmatization, which reduces words to their proper dictionary form (lemma), stemming often results in a shortened, sometimes non-dictionary form of the word. It’s useful for reducing the complexity of language data in tasks like search engines or text analysis.
Lemmatization ¶
- Lemmatization is the process of reducing words to their base or dictionary form, called a lemma, in Natural Language Processing (NLP). Unlike stemming, which just removes prefixes or suffixes, lemmatization takes into account the meaning and context of the word. For example, “running” is reduced to the lemma “run,” and “better” is reduced to “good.” Lemmatization helps in understanding the actual meaning of words and is commonly used in text processing tasks to simplify words for analysis while preserving their true form.
Sentiment Analysis¶
Sentiment Analysis is a technique in Natural Language Processing (NLP) used to determine the emotional tone or opinion expressed in a piece of text. It categorizes the sentiment as either positive, neutral, or negative based on the words and context.
Positive: The text expresses favorable or happy emotions. For example, “I love this product!” reflects a positive sentiment.
Neutral: The text is neutral, with no strong emotions or opinions. For example, “The product is available in stores.”
Negative: The text expresses negative or unfavorable emotions. For example, “I am disappointed with the product” indicates a negative sentiment.
Sentiment analysis helps businesses and researchers understand customer opinions, feedback, and overall sentiment in reviews, social media posts, and other text data.
Let’s Review Practically¶
Natural Language Processing¶
!pip install nltk
Requirement already satisfied: nltk in c:\users\karan\appdata\local\programs\python\python36\lib\site-packages (3.4) Requirement already satisfied: six in c:\users\karan\appdata\local\programs\python\python36\lib\site-packages (from nltk) (1.14.0) Requirement already satisfied: singledispatch in c:\users\karan\appdata\local\programs\python\python36\lib\site-packages (from nltk) (3.4.0.3)
import nltk
nltk.download()
showing info https://raw.githubusercontent.com/nltk/nltk_data/gh-pages/index.xml
True
Tokenizer¶
from nltk.tokenize import word_tokenize, sent_tokenize
s = "HI I am Learning NLP with Itronix Solutions"
s.split()
['HI', 'I', 'am', 'Learning', 'NLP', 'with', 'Itronix', 'Solutions']
word_tokenize(s)
['HI', 'I', 'am', 'Learning', 'NLP', 'with', 'Itronix', 'Solutions']
s = "Hi I am learning NLP with Mr. Karan Arora"
s.split('.')
['Hi I am learning NLP with Mr', ' Karan Arora']
sent_tokenize(s)
['Hi I am learning NLP with Mr. Karan Arora']
s = "Hi I am Learning NLP with Mr. Karan Arora. Welcome to Itronix Solutions"
s.split(".")
['Hi I am Learning NLP with Mr', ' Karan Arora', ' Welcome to Itronix Solutions']
sent_tokenize(s)
['Hi I am Learning NLP with Mr. Karan Arora.', 'Welcome to Itronix Solutions']
EXAMPLE_TEXT = "Hello Mr. Karan Arora, how are you doing today? The weather is great, and Python is awesome. The sky is blue. You shouldn't eat chicken."
sent_tokenize(EXAMPLE_TEXT)
['Hello Mr. Karan Arora, how are you doing today?', 'The weather is great, and Python is awesome.', 'The sky is blue.', "You shouldn't eat chicken."]
EXAMPLE_TEXT.split('.')
['Hello Mr', ' Karan Arora, how are you doing today? The weather is great, and Python is awesome', ' The sky is blue', " You shouldn't eat chicken", '']
EXAMPLE_TEXT = "Hello Mr. Karan Arora, how are you doing today? The weather is great, and Python is awesome. The sky is blue. You shouldn't eat chicken."
print(EXAMPLE_TEXT)
Hello Mr. Karan Arora, how are you doing today? The weather is great, and Python is awesome. The sky is blue. You shouldn't eat chicken.
print(EXAMPLE_TEXT.split())
['Hello', 'Mr.', 'Karan', 'Arora,', 'how', 'are', 'you', 'doing', 'today?', 'The', 'weather', 'is', 'great,', 'and', 'Python', 'is', 'awesome.', 'The', 'sky', 'is', 'blue.', 'You', "shouldn't", 'eat', 'chicken.']
print(word_tokenize(EXAMPLE_TEXT))
['Hello', 'Mr.', 'Karan', 'Arora', ',', 'how', 'are', 'you', 'doing', 'today', '?', 'The', 'weather', 'is', 'great', ',', 'and', 'Python', 'is', 'awesome', '.', 'The', 'sky', 'is', 'blue', '.', 'You', 'should', "n't", 'eat', 'chicken', '.']
Stop Words¶
from nltk.corpus import stopwords
stop_words = stopwords.words("english")
print(stop_words)
['i', 'me', 'my', 'myself', 'we', 'our', 'ours', 'ourselves', 'you', "you're", "you've", "you'll", "you'd", 'your', 'yours', 'yourself', 'yourselves', 'he', 'him', 'his', 'himself', 'she', "she's", 'her', 'hers', 'herself', 'it', "it's", 'its', 'itself', 'they', 'them', 'their', 'theirs', 'themselves', 'what', 'which', 'who', 'whom', 'this', 'that', "that'll", 'these', 'those', 'am', 'is', 'are', 'was', 'were', 'be', 'been', 'being', 'have', 'has', 'had', 'having', 'do', 'does', 'did', 'doing', 'a', 'an', 'the', 'and', 'but', 'if', 'or', 'because', 'as', 'until', 'while', 'of', 'at', 'by', 'for', 'with', 'about', 'against', 'between', 'into', 'through', 'during', 'before', 'after', 'above', 'below', 'to', 'from', 'up', 'down', 'in', 'out', 'on', 'off', 'over', 'under', 'again', 'further', 'then', 'once', 'here', 'there', 'when', 'where', 'why', 'how', 'all', 'any', 'both', 'each', 'few', 'more', 'most', 'other', 'some', 'such', 'no', 'nor', 'not', 'only', 'own', 'same', 'so', 'than', 'too', 'very', 's', 't', 'can', 'will', 'just', 'don', "don't", 'should', "should've", 'now', 'd', 'll', 'm', 'o', 're', 've', 'y', 'ain', 'aren', "aren't", 'couldn', "couldn't", 'didn', "didn't", 'doesn', "doesn't", 'hadn', "hadn't", 'hasn', "hasn't", 'haven', "haven't", 'isn', "isn't", 'ma', 'mightn', "mightn't", 'mustn', "mustn't", 'needn', "needn't", 'shan', "shan't", 'shouldn', "shouldn't", 'wasn', "wasn't", 'weren', "weren't", 'won', "won't", 'wouldn', "wouldn't"]
S = "HI I am Learning NLP with Itronix Solutions"
L = word_tokenize(S)
L
['HI', 'I', 'am', 'Learning', 'NLP', 'with', 'Itronix', 'Solutions']
for word in L:
    if word.lower() not in stop_words:
        print(word, end=' ')
HI Learning NLP Itronix Solutions
Synsets¶
from nltk.corpus import wordnet
SynArr = wordnet.synsets('car')
SynArr
[Synset('car.n.01'), Synset('car.n.02'), Synset('car.n.03'), Synset('car.n.04'), Synset('cable_car.n.01')]
syn = SynArr[0]
print(syn.name())
print(syn.definition())
print(syn.examples())
car.n.01
a motor vehicle with four wheels; usually propelled by an internal combustion engine
['he needs a car to get to work']
print(syn.hypernyms())
print(syn.hypernyms()[0].hyponyms())
[Synset('motor_vehicle.n.01')]
[Synset('bloodmobile.n.01'), Synset('car.n.01'), Synset('go-kart.n.01'), Synset('golfcart.n.01'), Synset('hearse.n.01'), Synset('snowplow.n.01'), Synset('motorcycle.n.01'), Synset('doodlebug.n.01'), Synset('truck.n.01'), Synset('amphibian.n.01'), Synset('four-wheel_drive.n.01')]
Part of Speech¶
- n : Noun
- v : Verb
- a : Adjective
- s : Adjective satellite
- r : Adverb
syn = wordnet.synsets('hello')[0]
print("Syn tag:",syn.pos())
syn = wordnet.synsets('doing')[0]
print("Syn tag:",syn.pos())
syn = wordnet.synsets('beautiful')[0]
print("Syn tag:",syn.pos())
syn = wordnet.synsets('quickly')[0]
print("Syn tag:",syn.pos())
syn = wordnet.synsets('pretty')[0]
print("Syn tag:",syn.pos())
Syn tag: n
Syn tag: v
Syn tag: a
Syn tag: r
Syn tag: s
from nltk.corpus import wordnet
word = "knife"
SynArr = wordnet.synsets(word)
print(SynArr)
[Synset('knife.n.01'), Synset('knife.n.02'), Synset('tongue.n.03'), Synset('knife.v.01')]
syn = SynArr[0]
print(syn)
print(syn.name())
print(syn.definition())
print(syn.hypernyms())
print(syn.hyponyms())
Synset('knife.n.01')
knife.n.01
edge tool used as a cutting instrument; has a pointed blade with a sharp edge and a handle
[Synset('edge_tool.n.01')]
[Synset('barong.n.01'), Synset('bolo.n.02'), Synset('bowie_knife.n.01'), Synset('bread_knife.n.01'), Synset('butcher_knife.n.01'), Synset('carving_knife.n.01'), Synset('case_knife.n.02'), Synset('cleaver.n.01'), Synset('drawknife.n.01'), Synset('hunting_knife.n.01'), Synset('letter_opener.n.01'), Synset('linoleum_knife.n.01'), Synset('parang.n.01'), Synset('parer.n.02'), Synset('pocketknife.n.01'), Synset('pruning_knife.n.01'), Synset('slicer.n.03'), Synset('surgical_knife.n.01'), Synset('table_knife.n.01')]
syn.hyponyms()[0]
Synset('barong.n.01')
Lemmas, Synonyms and Antonyms¶
from nltk.corpus import wordnet
word = "car"
SynArr = wordnet.synsets(word)
syn = SynArr[0]
print(syn)
print(syn.name())
print(syn.pos())
print(syn.definition())
print(syn.hypernyms())
print(syn.hyponyms())
Synset('car.n.01')
car.n.01
n
a motor vehicle with four wheels; usually propelled by an internal combustion engine
[Synset('motor_vehicle.n.01')]
[Synset('ambulance.n.01'), Synset('beach_wagon.n.01'), Synset('bus.n.04'), Synset('cab.n.03'), Synset('compact.n.03'), Synset('convertible.n.01'), Synset('coupe.n.01'), Synset('cruiser.n.01'), Synset('electric.n.01'), Synset('gas_guzzler.n.01'), Synset('hardtop.n.01'), Synset('hatchback.n.01'), Synset('horseless_carriage.n.01'), Synset('hot_rod.n.01'), Synset('jeep.n.01'), Synset('limousine.n.01'), Synset('loaner.n.02'), Synset('minicar.n.01'), Synset('minivan.n.01'), Synset('model_t.n.01'), Synset('pace_car.n.01'), Synset('racer.n.02'), Synset('roadster.n.01'), Synset('sedan.n.01'), Synset('sport_utility.n.01'), Synset('sports_car.n.01'), Synset('stanley_steamer.n.01'), Synset('stock_car.n.01'), Synset('subcompact.n.01'), Synset('touring_car.n.01'), Synset('used-car.n.01')]
print(syn.lemmas())
[Lemma('car.n.01.car'), Lemma('car.n.01.auto'), Lemma('car.n.01.automobile'), Lemma('car.n.01.machine'), Lemma('car.n.01.motorcar')]
SynArr
[Synset('car.n.01'), Synset('car.n.02'), Synset('car.n.03'), Synset('car.n.04'), Synset('cable_car.n.01')]
Synn = []
for syn in SynArr:
    for lem in syn.lemmas():
        Synn.append(lem.name())
print(set(Synn))
{'cable_car', 'motorcar', 'railcar', 'railroad_car', 'gondola', 'auto', 'elevator_car', 'machine', 'railway_car', 'car', 'automobile'}
from nltk.corpus import wordnet
word = "hello"
SynArr = wordnet.synsets(word)
Synn = []
for syn in SynArr:
    for lem in syn.lemmas():
        Synn.append(lem.name())
print(set(Synn))
{'hello', 'how-do-you-do', 'hi', 'howdy', 'hullo'}
wordnet.synsets('good')[1].lemmas()[0].antonyms()[0].name()
'evil'
wordnet.synsets('good')[2].lemmas()[0].antonyms()[0].name()
'bad'
Antonyms¶
SynArr = wordnet.synsets('good')
ANT = []
for syn in SynArr:
    for lem in syn.lemmas():
        for ant in lem.antonyms():
            ANT.append(ant.name())
ANT = list(set(ANT))
print(ANT)
['evil', 'badness', 'ill', 'evilness', 'bad']
words = ['good','bad','health']
for word in words:
    SynArr = wordnet.synsets(word)
    ANT = []
    for syn in SynArr:
        for lem in syn.lemmas():
            for ant in lem.antonyms():
                ANT.append(ant.name())
    ANT = list(set(ANT))
    print(ANT)
['evil', 'badness', 'ill', 'evilness', 'bad']
['good', 'unregretful', 'goodness']
['illness', 'unwellness']
Synonyms¶
from nltk.corpus import wordnet
word = "rich"
SynArr = wordnet.synsets(word)
Synn = []
for syn in SynArr:
    for lem in syn.lemmas():
        Synn.append(lem.name())
print(list(set(Synn)))
['robust', 'rich_people', 'full-bodied', 'plenteous', 'productive', 'fat', 'deep', 'fertile', 'rich', 'plentiful', 'racy', 'copious', 'ample']
Stemming¶
from nltk.stem import PorterStemmer
words = ["program","programs","programer","programing","programers"]
port = PorterStemmer()
for word in words:
    print(port.stem(word))
program
program
program
program
program
Text = "It is very important to be pythonly while you are pythoning with python. All pythoners have pythoned poorly at least once."
from nltk.tokenize import word_tokenize
words = word_tokenize(Text)
for word in words:
    print(port.stem(word))
It is veri import to be pythonli while you are python with python . all python have python poorli at least onc .
text = "He eats what he was eating yesterday at the eatery"
for word in word_tokenize(text):
    print(port.stem(word))
He eat what he wa eat yesterday at the eateri
port.stem('beautiful')
'beauti'
from nltk.stem import LancasterStemmer
lstem = LancasterStemmer()
lstem.stem('beautiful')
'beauty'
from nltk.stem import RegexpStemmer
rstem = RegexpStemmer('ing')
rstem.stem('skipping')
'skipp'
print(port.stem('king'))
print(lstem.stem('king'))
print(rstem.stem('king'))
king
king
k
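A fourth option worth knowing is the `SnowballStemmer` (the “Porter2” algorithm), usually a middle ground between the conservative Porter and the aggressive Lancaster stemmers; a quick sketch:

```python
from nltk.stem import SnowballStemmer

# Snowball ("Porter2") is language-aware; here the English rules are used.
sstem = SnowballStemmer('english')
for w in ['king', 'skipping', 'beautiful']:
    print(w, '->', sstem.stem(w))
# king -> king
# skipping -> skip
# beautiful -> beauti
```

Note how it undoubles the final consonant of “skipping” (unlike the naive `RegexpStemmer('ing')` above) while leaving “king” alone.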
Lemmatization¶
from nltk.stem import WordNetLemmatizer
lzr = WordNetLemmatizer()
lzr.lemmatize('working',pos='v')
'work'
lzr.lemmatize('better',pos='a')
'good'
port.stem('believes')
'believ'
lzr.lemmatize('believes',pos='n')
'belief'
Sentiment Analysis¶
- Positive
- Neutral
- Negative
from nltk.sentiment import SentimentIntensityAnalyzer
nltk.sentiment.util.demo_vader_instance('bad')
{'neg': 1.0, 'neu': 0.0, 'pos': 0.0, 'compound': -0.5423}
nltk.sentiment.util.demo_vader_instance('good')
{'neg': 0.0, 'neu': 0.0, 'pos': 1.0, 'compound': 0.4404}
nltk.sentiment.util.demo_vader_instance('you are very bad')
{'neg': 0.558, 'neu': 0.442, 'pos': 0.0, 'compound': -0.5849}
nltk.sentiment.util.demo_liu_hu_lexicon('you are very bad')
Negative
nltk.sentiment.util.demo_liu_hu_lexicon('you are very good')
Positive
nltk.sentiment.util.demo_liu_hu_lexicon('you are very')
Neutral
nltk.sentiment.util.demo_vader_instance('India to chair Taliban sanctions committee, to keep focus on terrorists and sponsors')
{'neg': 0.255, 'neu': 0.745, 'pos': 0.0, 'compound': -0.6249}
nltk.sentiment.util.demo_liu_hu_lexicon('Ready to consider any proposal but would not repeal three farm laws: Narendra Tomar ahead of talks with farmers')
Positive
Tweepy¶
from tweepy import OAuthHandler
from tweepy import API
from tweepy import Cursor
consumer_key="YOUR_CONSUMER_KEY"
consumer_secret="YOUR_CONSUMER_SECRET"
access_token="YOUR_ACCESS_TOKEN"
access_token_secret="YOUR_ACCESS_TOKEN_SECRET"
auth = OAuthHandler(consumer_key,consumer_secret)
auth.set_access_token(access_token,access_token_secret)
auth_api = API(auth)
user = auth_api.me()
print(user)
User(_api=<tweepy.api.API object at 0x000000001F3559B0>, _json={'id': 164994337, 'id_str': '164994337', 'name': 'Er Karan Arora', 'screen_name': 'erkaranarora', 'location': 'Mohali, India', 'profile_location': {'id': '175ac8b36fb626a2', 'url': 'https://api.twitter.com/1.1/geo/id/175ac8b36fb626a2.json', 'place_type': 'unknown', 'name': 'Mohali, India', 'full_name': 'Mohali, India', 'country_code': '', 'country': '', 'contained_within': [], 'bounding_box': None, 'attributes': {}}, 'description': '•Founder & CEO : TheDigitalAdda\n•Founder & CEO : Adson Deal\n•Founder & CEO : IFBPRO\n•Managing Director - ITRONIX SOLUTION\n•Managing Director - ITRONIX SOLUTIONS', 'url': 'https://t.co/D33kVSu48k', 'entities': {'url': {'urls': [{'url': 'https://t.co/D33kVSu48k', 'expanded_url': 'http://www.erkaranarora.com', 'display_url': 'erkaranarora.com', 'indices': [0, 23]}]}, 'description': {'urls': []}}, 'protected': False, 'followers_count': 84, 'friends_count': 49, 'listed_count': 0, 'created_at': 'Sat Jul 10 08:36:10 +0000 2010', 'favourites_count': 79, 'utc_offset': None, 'time_zone': None, 'geo_enabled': False, 'verified': False, 'statuses_count': 234, 'lang': None, 'status': {'created_at': 'Thu Aug 20 14:55:33 +0000 2020', 'id': 1296460869736583168, 'id_str': '1296460869736583168', 'text': 'The Industry Authority for APIs and Microservices.\nAPI Designer and API Security Architect Certified\n#APIAcademy… https://t.co/BnHJ9GOBTE', 'truncated': True, 'entities': {'hashtags': [{'text': 'APIAcademy', 'indices': [101, 112]}], 'symbols': [], 'user_mentions': [], 'urls': [{'url': 'https://t.co/BnHJ9GOBTE', 'expanded_url': 'https://twitter.com/i/web/status/1296460869736583168', 'display_url': 'twitter.com/i/web/status/1…', 'indices': [114, 137]}]}, 'source': '<a href="http://twitter.com/download/android" rel="nofollow">Twitter for Android</a>', 'in_reply_to_status_id': None, 'in_reply_to_status_id_str': None, 'in_reply_to_user_id': None, 'in_reply_to_user_id_str': None, 
'in_reply_to_screen_name': None, 'geo': None, 'coordinates': None, 'place': None, 'contributors': None, 'is_quote_status': False, 'retweet_count': 0, 'favorite_count': 1, 'favorited': False, 'retweeted': False, 'possibly_sensitive': False, 'lang': 'en'}, 'contributors_enabled': False, 'is_translator': False, 'is_translation_enabled': False, 'profile_background_color': '000000', 'profile_background_image_url': 'http://abs.twimg.com/images/themes/theme4/bg.gif', 'profile_background_image_url_https': 'https://abs.twimg.com/images/themes/theme4/bg.gif', 'profile_background_tile': False, 'profile_image_url': 'http://pbs.twimg.com/profile_images/918155801054900224/zmh7xDXY_normal.jpg', 'profile_image_url_https': 'https://pbs.twimg.com/profile_images/918155801054900224/zmh7xDXY_normal.jpg', 'profile_banner_url': 'https://pbs.twimg.com/profile_banners/164994337/1566613146', 'profile_link_color': '19CF86', 'profile_sidebar_border_color': '000000', 'profile_sidebar_fill_color': '000000', 'profile_text_color': '000000', 'profile_use_background_image': False, 'has_extended_profile': True, 'default_profile': False, 'default_profile_image': False, 'following': False, 'follow_request_sent': False, 'notifications': False, 'translator_type': 'none', 'suspended': False, 'needs_phone_verification': False}, id=164994337, id_str='164994337', name='Er Karan Arora', screen_name='erkaranarora', location='Mohali, India', profile_location={'id': '175ac8b36fb626a2', 'url': 'https://api.twitter.com/1.1/geo/id/175ac8b36fb626a2.json', 'place_type': 'unknown', 'name': 'Mohali, India', 'full_name': 'Mohali, India', 'country_code': '', 'country': '', 'contained_within': [], 'bounding_box': None, 'attributes': {}}, description='•Founder & CEO : TheDigitalAdda\n•Founder & CEO : Adson Deal\n•Founder & CEO : IFBPRO\n•Managing Director - ITRONIX SOLUTION\n•Managing Director - ITRONIX SOLUTIONS', url='https://t.co/D33kVSu48k', entities={'url': {'urls': [{'url': 'https://t.co/D33kVSu48k', 'expanded_url': 
'http://www.erkaranarora.com', 'display_url': 'erkaranarora.com', 'indices': [0, 23]}]}, 'description': {'urls': []}}, protected=False, followers_count=84, friends_count=49, listed_count=0, created_at=datetime.datetime(2010, 7, 10, 8, 36, 10), favourites_count=79, utc_offset=None, time_zone=None, geo_enabled=False, verified=False, statuses_count=234, lang=None, status=Status(_api=<tweepy.api.API object at 0x000000001F3559B0>, _json={'created_at': 'Thu Aug 20 14:55:33 +0000 2020', 'id': 1296460869736583168, 'id_str': '1296460869736583168', 'text': 'The Industry Authority for APIs and Microservices.\nAPI Designer and API Security Architect Certified\n#APIAcademy… https://t.co/BnHJ9GOBTE', 'truncated': True, 'entities': {'hashtags': [{'text': 'APIAcademy', 'indices': [101, 112]}], 'symbols': [], 'user_mentions': [], 'urls': [{'url': 'https://t.co/BnHJ9GOBTE', 'expanded_url': 'https://twitter.com/i/web/status/1296460869736583168', 'display_url': 'twitter.com/i/web/status/1…', 'indices': [114, 137]}]}, 'source': '<a href="http://twitter.com/download/android" rel="nofollow">Twitter for Android</a>', 'in_reply_to_status_id': None, 'in_reply_to_status_id_str': None, 'in_reply_to_user_id': None, 'in_reply_to_user_id_str': None, 'in_reply_to_screen_name': None, 'geo': None, 'coordinates': None, 'place': None, 'contributors': None, 'is_quote_status': False, 'retweet_count': 0, 'favorite_count': 1, 'favorited': False, 'retweeted': False, 'possibly_sensitive': False, 'lang': 'en'}, created_at=datetime.datetime(2020, 8, 20, 14, 55, 33), id=1296460869736583168, id_str='1296460869736583168', text='The Industry Authority for APIs and Microservices.\nAPI Designer and API Security Architect Certified\n#APIAcademy… https://t.co/BnHJ9GOBTE', truncated=True, entities={'hashtags': [{'text': 'APIAcademy', 'indices': [101, 112]}], 'symbols': [], 'user_mentions': [], 'urls': [{'url': 'https://t.co/BnHJ9GOBTE', 'expanded_url': 'https://twitter.com/i/web/status/1296460869736583168', 
'display_url': 'twitter.com/i/web/status/1…', 'indices': [114, 137]}]}, source='Twitter for Android', source_url='http://twitter.com/download/android', in_reply_to_status_id=None, in_reply_to_status_id_str=None, in_reply_to_user_id=None, in_reply_to_user_id_str=None, in_reply_to_screen_name=None, geo=None, coordinates=None, place=None, contributors=None, is_quote_status=False, retweet_count=0, favorite_count=1, favorited=False, retweeted=False, possibly_sensitive=False, lang='en'), contributors_enabled=False, is_translator=False, is_translation_enabled=False, profile_background_color='000000', profile_background_image_url='http://abs.twimg.com/images/themes/theme4/bg.gif', profile_background_image_url_https='https://abs.twimg.com/images/themes/theme4/bg.gif', profile_background_tile=False, profile_image_url='http://pbs.twimg.com/profile_images/918155801054900224/zmh7xDXY_normal.jpg', profile_image_url_https='https://pbs.twimg.com/profile_images/918155801054900224/zmh7xDXY_normal.jpg', profile_banner_url='https://pbs.twimg.com/profile_banners/164994337/1566613146', profile_link_color='19CF86', profile_sidebar_border_color='000000', profile_sidebar_fill_color='000000', profile_text_color='000000', profile_use_background_image=False, has_extended_profile=True, default_profile=False, default_profile_image=False, following=False, follow_request_sent=False, notifications=False, translator_type='none', suspended=False, needs_phone_verification=False)
print("Welcome",user.name,"You have successfully logged in")
Welcome Er Karan Arora You have successfully logged in
Print the Info of a Twitter User¶
target = input("Enter Twitter Username:")
username = auth_api.get_user(target)
print("Name:",username.name)
print("Description:",username.description)
print("Status Count:",username.statuses_count)
print("Friends Count:",username.friends_count)
print("Followers Count:",username.followers_count)
print("Account Created on:",username.created_at)
Enter Twitter Username:erkaranarora
Name: Er Karan Arora
Description: •Founder & CEO : TheDigitalAdda
•Founder & CEO : Adson Deal
•Founder & CEO : IFBPRO
•Managing Director - ITRONIX SOLUTION
•Managing Director - ITRONIX SOLUTIONS
Status Count: 234
Friends Count: 49
Followers Count: 84
Account Created on: 2010-07-10 08:36:10
Account Age¶
import datetime
acd = username.created_at
data = datetime.datetime.utcnow() - acd
data.days
3834
Average Tweets Per Day¶
tweets = username.statuses_count
float(tweets)/float(data.days)
0.06103286384976526
public_tweets = auth_api.home_timeline()
for tweet in public_tweets:
    print(tweet.text)
My New Year Starts on ur Birthday Happy Birthday My Jaan Luv u alot. My Princess @ India https://t.co/vAJBvVh0nt RT @InvestorNio: $NIO catalysts Jan 0/ NIO Used Car Market Launch (1/3) 1/ Dec delivery (1/4) 2/ NIO Day 2020 (1/9) -x 2New Sedan -NT2… RT @sidharth_shukla: Happy New year to one and all .... stay blessed stay loved ... ❤️ #googletrends #2020trends https://t.co/iqKV9vmbwI Hemant Saini #Kudos It's incredible how thorough your work is #GoingAboveAndBeyond https://t.co/mXU5ToiqT8 RT @viralbhayani77: #SidharthShukIa & #ShehnaazGiII snapped at Airport today 📸✈️🤗 @viralbhayani77 https://t.co/DmXs0nX2B9 RT @Prateek1017: Guys intelligent girls and boy's choice person 2020 today #MerryChristmasEveryone #RubinaDilaik Choose one , who's your fa… RT @A_nj_ana: Choose one , who's your favourite? #RahulVaidya #BB14 #RubinaDilaik #RubinalsTheBoss #RubiNav #RahulVadiya #RahulIsTheBoss 🔁… RT @sidharth_shukla: May this Christmas season fill your home with laughter, your heart with joy and your life with love. Merry Christmas t… Essa Njie #Kudos #ThankYou very much for everything you do https://t.co/b5xcrVBtnG Nice article about the process of update state in ReactJs https://t.co/qJSrH1gSWs #react #hook #usestate via @panzerdp RT @sidharth_shukla: Heartiest Congratulations to all #SidHearts for having #1MPostsForSidheartsOnIG Great going guys keep it up ❤️ Let's have a coffee and enjoy the chilling winter . Self made latte at home . #coffemug #coffelover #coffee… https://t.co/IFoEbsTbV6 RT @sidharth_shukla: Hey guys thank you so much for making my birthday so memorable and special...really appreciate the effort and love...… Just amazing realisation of relative sizes of different space matters.. https://t.co/JFLHm21SGK RT @sidharth_shukla: To @TheRashamiDesai #ParasChabbra @MahiraSharma_ #VishalAdityaSingh ..... 
and who so ever it may concern ....I am now… Just posted a photo @ India https://t.co/VfFcqt8GqK I Support this kindly share as many as possible @narendramodi @nitin_gadkari @KanganaTeam https://t.co/vgM7MgNew4
for tweet in Cursor(auth_api.search, q='narendra modi').items(10):
    print(tweet.text)
    print(nltk.sentiment.util.demo_liu_hu_lexicon(tweet.text))
    print()
RT @ANI: Prime Minister Narendra Modi should take first shot of COVID19 vaccine, then, we will also take it: RJD leader Tej Pratap Yadav ht… Neutral None RT @EastCoastRail: .@RailMinIndia Dedicated Freight Corridor - a game changer in Economic Development. Rewari - Madar section on Western co… Positive None RT @flawsome_guy: Time taken by Narendra Modi to tweet on Delhi riots: 72 hrs Time taken by Narendra Modi to tweet on US riots: 10 hrs Pr… Neutral None RT @flawsome_guy: Time taken by Narendra Modi to tweet on Delhi riots: 72 hrs Time taken by Narendra Modi to tweet on US riots: 10 hrs Pr… Neutral None RT @RanaAyyub: Omg, Indian media holding Trump responsible for inciting the mob in America. Great job but where is this passion when innoce… Positive None RT @Joydas: Narendra Modi: We should start COW Exams Secretary: Sir, it will sound ridiculous and people will Laugh Narendra Modi: Nonsense… Negative None RT @RanaAyyub: Omg, Indian media holding Trump responsible for inciting the mob in America. Great job but where is this passion when innoce… Positive None RT @Joydas: Narendra Modi: We should start COW Exams Secretary: Sir, it will sound ridiculous and people will Laugh Narendra Modi: Nonsense… Negative None RT @Joydas: Narendra Modi: We should start COW Exams Secretary: Sir, it will sound ridiculous and people will Laugh Narendra Modi: Nonsense… Negative None @SidheartChandan Bhai tag karo narendra modi fc and all big bjp handles... Neutral None