Using Python 3, write code to count words and find the most common words
| extract_mentions: (str) -> list of str | The parameter is tweet text. This function should return a list containing all of the mentions in the tweet, in the order they appear in the tweet. Each mention in the returned list should have the initial mention symbol removed, and the list should contain every mention encountered — including repeats, if a user is mentioned more than once within a tweet. |
| extract_hashtags: (str) -> list of str | The parameter is tweet text. This function should return a list containing all of the hashtags in the tweet, in the order they appear in the tweet. Each hashtag in the returned list should have the initial hash symbol removed, and hashtags should be unique. (If a tweet uses the same hashtag twice, it is included in the list only once. The order of the hashtags should match the order of the first occurrence of each tag in the tweet.) |
| count_words: (str, dict of {str: int}) -> None | The first parameter is tweet text, and the second is a dictionary containing lowercase words as keys and integer counts as values. The function should update the counts of words in the dictionary. If a word is not in the dictionary yet, it should be added. For the purposes of this function, words are defined by whitespace: every string that occurs between two pieces of whitespace (or between a piece of whitespace and either the beginning or end of the tweet) could be a word. Numeric characters are treated the same way as alphabetic characters. Hashtags, mentions, and URLs are not considered words. The empty string is not considered a word. Words don't contain punctuation, so punctuation should be removed from any candidate words. For example, if we are analyzing the tweet "@utmandrew Don't you wish you could vote? #MakeAmericaGreatAgain", we would increment the count for the word "you" by 2 and the counts for the words "dont", "wish", "could", and "vote" by 1. |
| common_words: (dict of {str: int}, int) -> None | The first parameter is the dictionary of word counts as described in count_words and the second is a positive integer N. This function should update the dictionary so that it includes only the most common (highest frequency) words. At most N words should be included in the dictionary. If including all words with some word count would result in a dictionary with more than N words, then none of the words with that word count should be included. (That is, in the case of a tie for the N+1st most common word, omit all of the words in the tie.) |
Solution
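import string

def count_words(text, counts):
    # split() with no argument splits on any whitespace and never yields empty strings.
    for token in text.lower().split():
        # Hashtags, mentions, and URLs are not words (URLs assumed to start with http:// or https://).
        if token.startswith(('#', '@', 'http://', 'https://')):
            continue
        # Remove punctuation, so "don't" becomes "dont" and "vote?" becomes "vote".
        word = ''.join(ch for ch in token if ch not in string.punctuation)
        if word:
            # A word seen for the first time starts at 1, not 0.
            counts[word] = counts.get(word, 0) + 1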
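def common_words(counts, n):
    # Order the words from most to least frequent.
    ranked = sorted(counts, key=counts.get, reverse=True)
    if len(ranked) <= n:
        return  # Every word already fits within the limit.
    # Count of the first word past the limit. Every word with this count or a
    # lower one is dropped, which also discards a tie that straddles the limit.
    cutoff = counts[ranked[n]]
    # Delete in place so the caller's dictionary is updated; rebinding the
    # parameter to a new dictionary would have no effect outside the function.
    for word in ranked:
        if counts[word] <= cutoff:
            del counts[word]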
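
The solution above covers only count_words and common_words. For extract_mentions and extract_hashtags, a minimal sketch might look like the following. The task description does not say which characters a mention or hashtag may contain, so this sketch assumes the symbol must start a whitespace-delimited token and is followed by letters, digits, or underscores; the names MENTION_RE and HASHTAG_RE are illustrative only.

```python
import re

# Assumption (not stated in the task): a mention or hashtag is the symbol
# followed by one or more letters, digits, or underscores, and the symbol
# must be at the start of the tweet or directly after whitespace.
MENTION_RE = re.compile(r"(?<!\S)@(\w+)")
HASHTAG_RE = re.compile(r"(?<!\S)#(\w+)")


def extract_mentions(text):
    """Return every mention in order of appearance, repeats included, without '@'."""
    return MENTION_RE.findall(text)


def extract_hashtags(text):
    """Return each hashtag once, in order of first appearance, without '#'."""
    unique = []
    for tag in HASHTAG_RE.findall(text):
        if tag not in unique:
            unique.append(tag)
    return unique


tweet = "@utmandrew Don't you wish you could vote? #MakeAmericaGreatAgain"
print(extract_mentions(tweet))
print(extract_hashtags(tweet))
```

A list is used for the hashtags rather than a set because the order of first occurrence has to be preserved. Trailing punctuation such as the comma in "@bob," is left out of the result under the assumption above.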
