UnboundLocalError: Local Variable 'soup' Referenced Before Assignment
Solution 1:
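import requests
from bs4 import BeautifulSoup


def main_spider(max_pages):
    page = 1
    # <= so that main_spider(1) still fetches one page; with <, the loop body never runs
    while page <= max_pages:
        url = "https://en.wikipedia.org/wiki/Star_Wars" + str(page)
        source_code = requests.get(url)
        plain_text = source_code.text
        # Name the parser explicitly to avoid bs4's "no parser specified" warning
        soup = BeautifulSoup(plain_text, "html.parser")
        # Inside the while loop: soup is always assigned before it is used
        for link in soup.findAll("a"):
            href = link.get("href")
            print(href)
        page += 1


main_spider(1)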
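In your case, soup is only assigned inside the while loop. Python has no separate scope for the loop itself (soup is local to the whole function), but the name is only bound once the loop body has run at least once, so using it after a loop that never ran raises the error.
Since it appears you're building a soup for one page at a time (and using the while loop to move between pages), I believe you want your soup.findAll('a') loop to be inside the while loop, i.e. run once per page.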
Solution 2:
UnboundLocalError means there is a code path where a local variable is not assigned before use. In this case, soup is used after the while loop that assigns it has completed. The code doesn't consider the case where the while loop never runs.
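To see that code path in isolation, here is a minimal sketch with no network access. The names last_value and try_last_value are made up for illustration and are not part of the original spider; the loop has the same shape as the one in the question.

```python
def last_value(limit):
    """Return the last value computed in the loop (hypothetical example)."""
    n = 1
    while n < limit:
        value = n * 10  # 'value' is bound only if the loop body runs
        n += 1
    return value  # raises UnboundLocalError when the loop ran zero times


def try_last_value(limit):
    """Call last_value and report the exception name instead of crashing."""
    try:
        return last_value(limit)
    except UnboundLocalError as exc:
        return type(exc).__name__


print(try_last_value(3))  # loop runs for n = 1 and n = 2
print(try_last_value(1))  # loop never runs, so 'value' is never assigned
```

Calling it with a limit of 1 mirrors main_spider(1) in the question: the condition is false on the first check, so the assignment inside the loop never happens.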
That exposes other bugs. First, the for loop should be indented so that it runs inside the while. Second, why didn't the outer loop run? That was simply a typo in the conditional: < should have been <=.
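The effect of the comparison operator can be checked without fetching anything. This is only a sketch: pages_visited and its inclusive flag are hypothetical helpers, not part of the spider, and they just record which page numbers the loop would reach.

```python
def pages_visited(max_pages, inclusive):
    """Return the page numbers a 'while page < / <= max_pages' loop would visit."""
    visited = []
    page = 1
    while (page <= max_pages) if inclusive else (page < max_pages):
        visited.append(page)
        page += 1
    return visited


print(pages_visited(1, inclusive=False))  # loop with <
print(pages_visited(1, inclusive=True))   # loop with <=
print(pages_visited(3, inclusive=True))
```

If you'd rather not manage the counter by hand, for page in range(1, max_pages + 1) visits the same pages as the <= version.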