Python Programming Tutorial – 35 – Word Frequency Counter (1/3)
30 responses to “Python Programming Tutorial – 35 – Word Frequency Counter (1/3)”
-
import requests as rq
from bs4 import BeautifulSoup
import operator

def getter(url):
    source_code = rq.get(url).text
    soup = BeautifulSoup(source_code, 'html.parser')
    result = []
    for link in soup.findAll('a', {'class': 'list-group-item'}):
        ere = str(link.string)
        final = ere.lower().split()
        for here in final:
            print(here)
            result.append(here) -
Hey Bucky,
My code is running but it isn't printing anything.
Can you help? -
Guys, I've tried to make this count the words from the comments on this YouTube vid, like this:
for comment in soup.find_all("div", {"class": "comment-renderer-text-content"}):
but if I print out 'comment' it's empty -
The website buckysroom is down.
-
Question: how can we know that we need to import something to do a certain thing? And how do we find out the way to use it?
-
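One way to answer the question above, sketched with the standard library only: the official docs list what each module provides, and from inside Python, `dir()` and `help()` show the same information.

```python
# dir() lists the names a module (or any object) exposes; help() prints
# its documentation. Handy for discovering what an import gives you.
import operator

names = dir(operator)
print('itemgetter' in names)  # → True

# help(operator.itemgetter) would print the full docs; here we just
# grab the first line of the docstring to keep the output short.
print(operator.itemgetter.__doc__.splitlines()[0])
```

The same works for third-party modules like `requests` or `bs4` once they are installed.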
Does anyone have a link that will work for this in July 2017?
-
Ah, I tried to write my own word counter before I came across this video… this is gold for me, to compare the code and improve now…
-
I'm not able to download the module "BeautifulSoup". Can anyone tell me any other source to download modules in PyCharm?
-
I ran this code but it isn't working:
import requests
from bs4 import BeautifulSoup
import operator

def start(url):
    word_list = []
    source_code = requests.get(url).text
    soup = BeautifulSoup(source_code, "html.parser")
    for before_text in soup.find_all("a", {"class": "result-title"}):
        content = before_text.string
        words = content.lower().split()
        for each_word in words:
            print(each_word)
            word_list.append(each_word) -
What if I wanted to access the article that the link opens and then count the words in that article, instead of just the title?
-
I thought it was a nice touch that you mentioned not knowing who Richard Feynman is. It's okay to be ignorant in one area and proficient in another. We're human after all.
-
Before watching this vid, this is my counter for a simple string:
variable = "i have a lot of have's so i'm special because of my have's"
word = "have"
how_many_have = len(variable.split(word)) - 1
I don't know how or where to use regex in Python, so this is what I've got, even if it doesn't work when you have something like havesnif (hope that's not an actual word) or anything like that -
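For what it's worth, regex with a word boundary (`\b`) avoids the `havesnif` problem the commenter mentions. A minimal sketch using the same example string:

```python
import re

variable = "i have a lot of have's so i'm special because of my have's"

# \b matches a word boundary, so "have" will not match inside a longer
# word like "havesnif"; re.findall returns every non-overlapping match.
count = len(re.findall(r"\bhave\b", variable))
print(count)  # → 3

# The boundary check in action: no match inside a longer word.
print(len(re.findall(r"\bhave\b", "havesnif")))  # → 0
```

Note that the apostrophe in "have's" is a non-word character, so it still counts as a boundary and both occurrences match.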
Working example as of 4/19/2017
import requests
from bs4 import BeautifulSoup
import operator

def all_words(url):
    word_list = []
    source_code = requests.get(url).text
    soup = BeautifulSoup(source_code, "html.parser")
    for before_text in soup.find_all("a", {"class": "result-title"}):
        content = before_text.string
        words = content.lower().split()
        for each_word in words:
            print(each_word)
            word_list.append(each_word)

all_words("https://cnj.craigslist.org/search/sys")
-
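Parts 2 and 3 of this tutorial go on to sort the collected words by frequency (that is what the otherwise-unused `import operator` is for). A sketch of the same idea using the standard library's `collections.Counter`, with a hypothetical hard-coded word list standing in for the scraped data:

```python
from collections import Counter

# Stand-in for the word_list the scraper builds (hypothetical data).
word_list = ["python", "tutorial", "python", "counter", "python", "tutorial"]

# Counter tallies occurrences; most_common() returns (word, count)
# pairs sorted from most to least frequent.
for word, count in Counter(word_list).most_common():
    print(word, count)
# → python 3
#   tutorial 2
#   counter 1
```

The video series itself builds a plain dict and sorts it with `operator.itemgetter`; `Counter` is a shorter route to the same result.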
Wonderful series! I avoided the inner for loop altogether by using the list method extend instead of append.
for source_text in soup.findAll('a', {'class': "title text-semibold"}):
    text = source_text.string.strip()
    text_list = text.lower().split()
    words.extend(text_list) -
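The difference the comment above relies on, in a minimal stdlib sketch:

```python
words = []
words.append(["hello", "world"])  # append adds the list itself as ONE element
print(words)                      # → [['hello', 'world']]

words = []
words.extend(["hello", "world"])  # extend adds each element individually
print(words)                      # → ['hello', 'world']
```

That is why `extend` replaces the inner `for each_word in words: word_list.append(each_word)` loop in one line.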
This is the error I got while trying to run the same code:
Traceback (most recent call last):
  line 153, in <module>
    import requests
ModuleNotFoundError: No module named 'requests' -
Why does it work only on certain websites? When I try it on olx.in I get an error message, but when I try it on craigslist it works perfectly. Pls help :)
-
Anyone wanna help? I've got the html parser in there but I cannot get anything on the screen:
import pdb
import requests
from bs4 import BeautifulSoup

def start(url):
    pdb.set_trace()
    word_list = []
    source_code = requests.get(url).text
    soup = BeautifulSoup(source_code, "html.parser")
    for post_text in soup.findAll('a', {'class': "esc-lead-article-title"}):
        content = post_text.text
        words = content.lower().split()
        for each_word in words:
            word_list.append(each_word)
    print(word_list)

start('https://news.google.com/')
-
import requests
from bs4 import BeautifulSoup
import operator

def start(url):
    word_list = []
    source_code = requests.get(url).text
    soup = BeautifulSoup(source_code, "html.parser")
    for headline_text in soup.findAll('a', {'class': 'hdrlnk'}):
        content = headline_text.string
        words = content.lower().split()
        for each_word in words:
            print(each_word)
            word_list.append(each_word) -
Thanks for sharing your knowledge. Great information! What if I would like to find any word in any tag inside the HTML? How do I do that?
-
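One way to answer the question above: to search every tag rather than just `<a>`, extract all of the page's text first. With BeautifulSoup that is `soup.get_text()`; the same idea is sketched below with only the standard library's `html.parser` (so it runs without bs4 installed), using a hypothetical page snippet instead of a real download.

```python
from html.parser import HTMLParser

class TextCollector(HTMLParser):
    """Collects the text content of every tag on the page."""
    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)

# Hypothetical snippet standing in for a scraped page.
html = "<h1>Word Counter</h1><p>count the <b>words</b> here</p>"
parser = TextCollector()
parser.feed(html)

all_words = " ".join(parser.chunks).lower().split()
print("words" in all_words)  # → True
```

Once you have `all_words`, the counting step is identical to the tutorial's.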
Dear Bucky,
Just wanted to know, "How are you?" 😛
– From Bucket -
When I try to do from bs4 import BeautifulSoup I get unresolved reference. Help? Edit: Figured it out. Go to cmd and type pip install beautifulsoup4
-
Fuck you Bucky. You don't know who Richard Feynman is ??
-
I keep on getting this error message:
To get rid of this warning, change this:
 BeautifulSoup([your markup])
to this:
 BeautifulSoup([your markup], "html.parser")
  markup_type=markup_type))
Traceback (most recent call last):
  File "C:/Users/burley/PycharmProjects/untitled/word frequency.py", line 16, in <module>
    start("https://thenewboston.com/forum/")
  File "C:/Users/burley/PycharmProjects/untitled/word frequency.py", line 8, in start
    soup = BeautifulSoup(source_code)
  File "C:\Users\*********\Local\Programs\Python\Python35-32\lib\site-packages\bs4\__init__.py", line 176, in __init__
    elif len(markup) <= 256:
TypeError: object of type 'Response' has no len()
Help me plsssssss
-
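For anyone hitting the same traceback: `requests.get(url)` returns a `Response` object, while `BeautifulSoup` needs the HTML string, which lives on the response's `.text` attribute (the `len(markup)` check in the traceback is exactly where that mismatch blows up). A no-network sketch of the difference, using a hypothetical stand-in for the response:

```python
class FakeResponse:
    """Hypothetical stand-in for requests.Response (no network needed)."""
    text = "<html><body><a class='hdrlnk'>hello world</a></body></html>"

resp = FakeResponse()

# len(resp) raises TypeError: object of type 'FakeResponse' has no len().
# That is the same failure BeautifulSoup hits when handed the response
# object instead of its .text.
source_code = resp.text  # the HTML string BeautifulSoup expects
print(type(source_code).__name__)  # → str

# With requests and bs4 installed, the fixed lines are:
# source_code = requests.get(url).text
# soup = BeautifulSoup(source_code, "html.parser")  # named parser, no warning
```

Naming the parser explicitly also silences the warning quoted at the top of the comment.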
I need help! Why is this code not working?
import requests
from bs4 import BeautifulSoup
import operator

url = "https://santabarbara.craigslist.org"

def start(url):
    word_list = []
    SourceCode = BeautifulSoup.get(url).text
    soup = BeautifulSoup(SourceCode)
    for post_text in soup.findAll("span", {"class": "txt"}):
        contact = post_text.string
        words = contact.lower().split()
        for each_word in words:
            print(each_word)
            word_list.append(each_word)

start(url) -
Using
content = post_text.text
seems to fix that NoneType error for the .lower() -
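Why that fix works, assuming bs4 is installed: `.string` returns `None` whenever a tag has more than one child, while `.text` concatenates all nested text, so `.lower()` never receives a `None`. A small sketch with a hypothetical snippet:

```python
from bs4 import BeautifulSoup

# The <a> contains a nested tag, not just plain text.
soup = BeautifulSoup('<a class="hdrlnk">big <b>sale</b></a>', "html.parser")
link = soup.find("a")

print(link.string)  # → None (more than one child, so .string gives up)
print(link.text)    # → big sale (.text joins all nested text)
```

This is exactly the case that makes `content.lower()` crash with `'NoneType' object has no attribute 'lower'` when listings contain markup inside the link.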
TY !
-
I wrote the exact same code, but at the line where it says content = post_text.string, I get nothing after pressing "." in the editor. What's wrong with it? Where is the string attribute?
-
What if the important content that identifies the object we want to obtain from the source code is outside the <a> tag?
How can we do it? -
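In answer to the question above: `find_all` is not limited to `<a>`; you can search any tag name and attribute dictionary, or navigate from a found tag to its parent or siblings. A sketch assuming bs4 is installed, with a hypothetical listing snippet:

```python
from bs4 import BeautifulSoup

html = '''
<div class="post">
  <span class="price">$25</span>
  <a class="hdrlnk">desk chair</a>
</div>
'''
soup = BeautifulSoup(html, "html.parser")

# Search any tag directly, not just <a>...
price = soup.find("span", {"class": "price"}).text
print(price)  # → $25

# ...or navigate from the <a> to surrounding elements.
link = soup.find("a", {"class": "hdrlnk"})
print(link.parent["class"])  # → ['post']
```

Note that bs4 returns multi-valued attributes like `class` as a list.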
Did anyone else see the 'with money you can buy gender, but not love' quote in that html page? Just goes to show that with a dictionary, you can buy a translation, but not understanding, or something.
-
It only prints the last sentence from that particular page, like this:
programming
tutorial
–
33
–
challenge
#1! -