Python Programming Tutorial – 35 – Word Frequency Counter (1/3)


30 responses to “Python Programming Tutorial – 35 – Word Frequency Counter (1/3)”

  1. Harsath Bill Gates

    import requests as rq
    from bs4 import BeautifulSoup
    import operator

    def getter(url):
        source_code = rq.get(url).text
        soup = BeautifulSoup(source_code, 'html.parser')
        result = []
        for link in soup.findAll('a', {'class': 'list-group-item'}):
            ere = str(link.string)
            final = ere.lower().split()
            for here in final:
                print(here)
                result.append(here)

    getter('https://thenewboston.com/videos.php')

  2. Muhammad Azeem

    Hi Bucky,
    My code runs but doesn't print anything.
    Can you help?

  3. Amin THG

    Guys, I've tried to make this count the words from the comments on this YouTube video, like this:

    for comment in soup.find_all("div", {"class": "comment-renderer-text-content"}):

    but if I print out 'comment' it's empty.

  4. malla 8848

    Question: How can we know that we need to import something to do a certain thing? And also how to use it?

  5. sleepcol

    Does anyone have a link that will work for this in July 2017?

  6. peng1110

    Ah, I tried to write my own word counter before I came across this video… this is gold for me to compare the code and improve now…

  7. Aayush Ranjan

    I'm not able to download the module "BeautifulSoup". Can anyone tell me any other source to download modules in PyCharm?

  8. Shreya Kaushik

    I ran this code but it isn't working:

    import requests
    from bs4 import BeautifulSoup
    import operator

    def start(url):
        word_list = []
        source_code = requests.get(url).text
        soup = BeautifulSoup(source_code, "html.parser")
        for before_text in soup.find_all("a", {"class": "result-title"}):
            content = before_text.string
            words = content.lower().split()
            for each_word in words:
                print(each_word)
                word_list.append(each_word)

    start('https://cnj.craigslist.org/search/sys')

  9. Michael Risling

    What if I wanted to access the article that the link opens, then count the words in that article instead of just the title?

  10. [ Elina ]

    I thought it was a nice touch that you mentioned not knowing who Richard Feynman is. It's okay to be ignorant in one area and proficient in another. We're human after all.

  11. Hernan Mendez

    Before watching this video, this was my counter for a simple string:

    variable = "i have a lot of have's so i'm special because of my have's"
    word = "have"
    how_many_have = len(variable.split(word)) - 1

    I don't know how or where to use regex in Python, so this is what I've got, even though it doesn't work if you have something like "havesnif" (hope that's not an actual word) or anything like that.
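
One way to handle the "havesnif" case from the comment above is a regular expression with word boundaries. This is only a sketch: the helper name count_word is made up for illustration, and it assumes that "have's" should still count as an occurrence of "have" (the apostrophe ends the word as far as \b is concerned).

```python
import re


def count_word(text, word):
    """Count whole-word occurrences of `word` in `text`, ignoring case."""
    # \b matches at a word boundary, so "have" inside "havesnif" is skipped
    pattern = r"\b" + re.escape(word) + r"\b"
    return len(re.findall(pattern, text, flags=re.IGNORECASE))


variable = "i have a lot of have's so i'm special because of my have's"
longer = variable + " havesnif"

print(count_word(variable, "have"))      # 3
print(count_word(longer, "have"))        # 3, "havesnif" is not counted
print(len(longer.split("have")) - 1)     # 4, the split approach counts it
```

re.escape is there so that a search word containing characters such as "." or "+" is matched literally.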

  12. Komron Aripov

    Working example as of 4/19/2017

    import requests
    from bs4 import BeautifulSoup
    import operator

    def all_words(url):
        word_list = []
        source_code = requests.get(url).text
        soup = BeautifulSoup(source_code, "html.parser")
        for before_text in soup.find_all("a", {"class": "result-title"}):
            content = before_text.string
            words = content.lower().split()
            for each_word in words:
                print(each_word)
                word_list.append(each_word)

    all_words("https://cnj.craigslist.org/search/sys")

  13. DavidAnatolie

    Wonderful series! I avoided the inner for loop altogether by using the list method extend instead of append.

    for source_text in soup.findAll('a', {'class': "title text-semibold"}):
        text = source_text.string.strip()
        text_list = text.lower().split()
        words.extend(text_list)
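
To see that the extend version above collects the same words as the inner loop with append, here is a small offline comparison. The titles list is made-up sample data standing in for the scraped link text.

```python
# Made-up sample titles standing in for the scraped link text
titles = ["Python Programming Tutorial", "Word Frequency Counter"]

# Version 1: inner loop, appending one word at a time
words_append = []
for title in titles:
    for word in title.lower().split():
        words_append.append(word)

# Version 2: extend adds every item of the split list in one call
words_extend = []
for title in titles:
    words_extend.extend(title.lower().split())

print(words_extend)
print(words_append == words_extend)  # True
```

Note that append with a whole list would add the list itself as a single nested item, which is why the inner loop is needed in the append version.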

  14. Dipak Anand

    This is the error I got while trying to run the same code.

    Traceback (most recent call last):
    line 153, in <module>
    import requests
    ModuleNotFoundError: No module named 'requests'

  15. Srineesh Salur

    Why does it work only on certain websites? When I try it on olx.in I get an error message, but when I try it on Craigslist it works perfectly. Please help :)

  16. Doug North

    Anyone wanna help? I've got the HTML parser in there but I cannot get anything on the screen:

    def start(url):
        pdb.set_trace()
        word_list = []
        source_code = requests.get(url).text
        soup = BeautifulSoup(source_code, "html.parser")
        for post_text in soup.findAll('a', {'class': "esc-lead-article-title"}):
            content = post_text.text
            words = content.lower().split()
            for each_word in words:
                word_list.append(each_word)
        print(word_list)

    start('https://news.google.com/')

  17. Sean McDougal

    import requests
    from bs4 import BeautifulSoup
    import operator

    def start(url):
        word_list = []
        source_code = requests.get(url).text
        soup = BeautifulSoup(source_code, "html.parser")
        for headline_text in soup.findAll('a', {'class': 'hdrlnk'}):
            content = headline_text.string
            words = content.lower().split()
            for each_word in words:
                print(each_word)
                word_list.append(each_word)

    start('https://seattle.craigslist.org/search/jjj')

  18. Ricardo Portela da Silva

    Thanks for sharing your knowledge. Great information! What if I would like to find any word in any tag inside the HTML? How do I do that?

  19. Abcd Wxyz

    Dear Bucky,
    Just wanted to know, "How are you?" 😛
    – From Bucket

  20. Paul Garcia

    When I try to do from bs4 import BeautifulSoup I get an unresolved reference. Help? Edit: Figured it out. Go to cmd and type pip install beautifulsoup4

  21. Shritej Chavan

    Fuck you Bucky. You don't know who Richard Feynman is ??

  22. craig burley

    I keep on getting this error message:

    To get rid of this warning, change this:

    BeautifulSoup([your markup])

    to this:

    BeautifulSoup([your markup], "html.parser")

    markup_type=markup_type))
    Traceback (most recent call last):
      File "C:/Users/burley/PycharmProjects/untitled/word frequency.py", line 16, in <module>
        start("https://thenewboston.com/forum/")
      File "C:/Users/burley/PycharmProjects/untitled/word frequency.py", line 8, in start
        soup= BeautifulSoup(source_code)
      File "C:\Users\*********\Local\Programs\Python\Python35-32\lib\site-packages\bs4\__init__.py", line 176, in __init__
        elif len(markup) <= 256:
    TypeError: object of type 'Response' has no len()

    Help me please!

  23. Mohammad Mahjoub

    I need help! Why is this code not working?

    import requests
    from bs4 import BeautifulSoup
    import operator

    url = " https://santabarbara.craigslist.org"

    def start(url):
        word_list = [ ]
        SourceCode = BeautifulSoup.get(url).text
        soup = BeautifulSoup(SourceCode)
        for post_text in soup.findAll("span", {"class" : "txt" }):
            contact = post_text.string
            words = contact.lower().split()
            for each_word in words:
                print(each_word)
                word_list.append(each_word)

    start(url)
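
A possible fix for the code above, offered as a sketch rather than a tested answer for that site: the page is fetched with requests.get (BeautifulSoup is only the parser and does no downloading), and the parser name is passed explicitly. The helper words_from_html and the sample markup are made up here so the parsing step can be tried without a network connection; whether the "txt" class still exists on the live page is an assumption.

```python
import requests
from bs4 import BeautifulSoup


def words_from_html(html, tag="span", css_class="txt"):
    """Return the lowercase words found inside every matching tag."""
    soup = BeautifulSoup(html, "html.parser")
    word_list = []
    for post_text in soup.find_all(tag, {"class": css_class}):
        # get_text() always returns a str, so .lower() is safe to call
        word_list.extend(post_text.get_text().lower().split())
    return word_list


def start(url):
    # requests performs the HTTP GET; .text is the page source as a string
    source_code = requests.get(url).text
    return words_from_html(source_code)


# Offline check with made-up markup, so no network is needed
sample = '<span class="txt">Community Events</span><span class="other">skip</span>'
print(words_from_html(sample))  # ['community', 'events']
```

Calling start(url) with the Craigslist address from the comment would then return the word list for the live page, provided that class name is present in its source.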

  24. videovulcan

    content = post_text.text
    Seems to fix that NoneType error on the .lower()
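
A short illustration of why that change helps, using made-up markup: in Beautiful Soup, .string is None when a tag contains more than one child node (for example text plus a nested tag), while .text and get_text() return all of the text joined together.

```python
from bs4 import BeautifulSoup

# Made-up markup: one plain link, one link with a nested <b> tag
html = (
    '<a class="title">plain title</a>'
    '<a class="title">nested <b>bold</b> title</a>'
)
soup = BeautifulSoup(html, "html.parser")
links = soup.find_all("a", {"class": "title"})

for link in links:
    # .string is None for the second link, so .string.lower() would raise
    # AttributeError: 'NoneType' object has no attribute 'lower'
    print(repr(link.string), "|", repr(link.get_text()))
```

The first link prints the same text for both, while the second prints None for .string and 'nested bold title' for get_text().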

  25. Sad Mo

    I wrote the exact same code, but at the line where it says content = post_text.string, I get nothing after pressing ".". What's wrong with it? Where is the string function?

  26. Sujeto Irreductible

    What if the important content that identifies the object we want to obtain from the source code is outside the <a>?
    How can we do it?

  27. rungus24

    Did anyone else see the 'with money you can buy gender, but not love' quote in that html page? Just goes to show that with a dictionary, you can buy a translation, but not understanding, or something.

  28. Batman

    It only prints the last sentence from that particular page,
    like this:

    c
    programming
    tutorial

    33

    challenge
    #1!
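
One common cause of output like this, assuming the code otherwise matches the video, is that the inner loop is indented one level too little and therefore runs only once, after the outer loop has finished. The sketch below uses made-up titles and two hypothetical function names to show the difference without any scraping.

```python
# Made-up titles standing in for the scraped link text
titles = ["Python Tutorial 1", "C Programming Tutorial 33"]


def last_title_only(titles):
    collected = []
    for title in titles:
        words = title.lower().split()
    # Dedented by mistake: this loop runs once, after the outer loop ends,
    # so `words` holds only the words of the last title
    for word in words:
        collected.append(word)
    return collected


def every_title(titles):
    collected = []
    for title in titles:
        words = title.lower().split()
        # Indented inside the outer loop: runs once per title
        for word in words:
            collected.append(word)
    return collected


print(last_title_only(titles))  # ['c', 'programming', 'tutorial', '33']
print(every_title(titles))
```

If the real code has the `for each_word in words:` line aligned with the outer `for`, indenting it (and its body) one level deeper should print the words from every title.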
