July 5, 2021 Web Crawling & Web Scraping

Console application for downloading images from Reddit in Python

RedditImageScraper: a console application for downloading images from Reddit in Python. This short Python script was created for mass-downloading images from Reddit. It will be used later to create datasets for several machine learning projects. To use the script, you need a Reddit account and must sign up for a developer account. You will be assigned a client_id and client_secret, which you have to enter in config.ini before you run the script. Usage: The -r parameter […]
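As a rough sketch of the configuration step described above, the credentials could be read from config.ini with Python's standard configparser module. The section name "reddit" and the helper name load_credentials are assumptions made for illustration; the excerpt does not show the script's actual file layout.

```python
import configparser


def load_credentials(ini_text):
    """Parse INI text and return (client_id, client_secret).

    The "reddit" section name is an assumption for this sketch, not
    something confirmed by the RedditImageScraper excerpt.
    """
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    section = parser["reddit"]
    return section["client_id"], section["client_secret"]


if __name__ == "__main__":
    # Placeholder values; real ones come from your Reddit developer account.
    sample = "[reddit]\nclient_id = my-id\nclient_secret = my-secret\n"
    print(load_credentials(sample))
```

To read an actual file, the same parser can be pointed at it with parser.read("config.ini") instead of read_string.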

June 27, 2021 Web Crawling & Web Scraping

A large-scale web crawler for surveying the security of Moroccan sites

Security Analyzer: a large-scale web crawler (which can also be considered a vulnerability scanner) for surveying the security of Moroccan sites.
Current architecture / Workflow
Implemented defence mechanism scanners: X-Frame-Options, HTTP Strict-Transport-Security, X-Content-Type-Options, Secure cookies, HttpOnly cookies, CSRF tokens, Content Security Policy, iframe sandboxing.
Implemented vulnerability scanners: X-XSS-Protection, mixed-content inclusions, vulnerable remote JavaScript inclusion, SSL stripping, vulnerable forms, outdated server software, outdated CMS, sensitive files, HTTP Parameter Pollution, information leakage, insecure SSL implementation.
Usage: Check the examples in the […]
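To give a feel for what a header-based defence-mechanism scanner checks, here is a minimal sketch that reports which of four well-known security response headers are absent from a response's headers. The function name and the header selection are illustrative assumptions; this is not code from Security Analyzer.

```python
# Hypothetical helper for illustration; not taken from Security Analyzer.
REQUIRED_HEADERS = (
    "X-Frame-Options",
    "Strict-Transport-Security",
    "X-Content-Type-Options",
    "Content-Security-Policy",
)


def missing_security_headers(headers):
    """Return the defence headers that are absent from a headers mapping.

    HTTP header names are case-insensitive, so names are compared in
    lower case.
    """
    present = {name.lower() for name in headers}
    return [h for h in REQUIRED_HEADERS if h.lower() not in present]


if __name__ == "__main__":
    sample = {"x-frame-options": "DENY", "Content-Type": "text/html"}
    print(missing_security_headers(sample))
```

A real scanner would fetch the headers over HTTP and also inspect their values, for example the max-age of Strict-Transport-Security; this sketch only tests for presence.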

June 27, 2021 Web Crawling & Web Scraping

Python Crawler written in Python 3

pycrawler: a Python crawler written in Python 3 (supports Python 3.6, 3.7 and 3.8).
Installation and use: set up a virtualenv.
which python3   # outputs the path of your python3
# now set up a python3 virtualenv
mkvirtualenv crawl3 -p $(which python3)
workon crawl3
python main.py -d5 http://gotchacode.com   # -d5 means crawl to a depth of 5
Results: the output is:
100%|████████| 50/50 [00:00<00:00, 29200.11it/s]
100%|████████| 9/9 [00:00<00:00, 22563.50it/s]
100%|████████| 9/9 [00:00<00:00, 21375.28it/s]
100%|████████| 10/10 [00:00<00:00, 22227.37it/s]
CRAWLER STARTED: https://vinitkumar.me, will […]

June 23, 2021 Web Crawling & Web Scraping

A Python Scraper for Downloading an Instagram Profile’s Posts, Stories and ProfilePic

InstAstra: a Python Instagram scraper for downloading a profile’s posts, stories and profile picture, and for viewing the details of a particular Instagram profile. InstAstra is an Instagram scraper tool that lets you get information about any Instagram profile. It also lets you download the profile picture, all saved stories, and all posts of any Instagram account. The Instaloader Python module is used to make the project work. Developed by: Jeevan-Gharate. Module used: Instaloader. To use the tool, simply download the zip […]

June 23, 2021 Web Crawling & Web Scraping

Collects all accepted codes submitted within the given timeframe of any contest

Collects all accepted (partially and fully scored) codes submitted within the given timeframe of any contest, and saves them locally for a plagiarism check.
Installation: in a terminal, run pip install selenium and then pip install pyperclip.
Downloads and procedure: type chrome://version in the Chrome address bar to find your Chrome version; download the ChromeDriver zip file that matches your Chrome version and operating system; unzip (extract) the file to your preferred location; copy the path.
Changes: copy the path of chromedriver.exe (example: C:\Users\user\Downloads) add […]

June 9, 2021 Web Crawling & Web Scraping

An in-progress web scraping project built with Python

New to Streaming Scraper: an in-progress web scraping project built with Python, R, and SQL. It retrieves TV and movie data from two sources, What’s on Netflix (WON) and Rotten Tomatoes (RT), then transforms the data and stores it in a MySQL database. RT data are cleaned and transformed with Python, while WON data are cleaned and transformed with R. All data are piped into a MySQL database, then retrieved […]

May 24, 2021 Web Crawling & Web Scraping

A tool to easily scrape YouTube data using the Google API

YouTube data scraper: easily scrape data from the YouTube homepage, a YouTube channel/user, search results, playlists, and a single video. Requires Python 3.6+.
Installing:
$ pip3 install -r requirements.txt
$ python3 Web-Youtube.py -h
Works with: YouTube homepage, YouTube search, channel/user, video, and playlists.
Usage: Web-Youtube.py [OPTIONS]
--link YouTube link
--api Google/YouTube API key
--comments Get comments from YouTube videos [turning on will increase program run time]
--subtitles Get subtitles from YouTube videos
--durationseconds Get seconds from YouTube […]

May 15, 2021 Web Crawling & Web Scraping

CNN implemented from scratch using Python and NumPy

neural-network-from-scratch: a CNN implemented from scratch using Python and NumPy.
Project commands:
Install requirements: make install
Train model: make train
Execute trained model: make run
Test: make test
GitHub: https://github.com/Joaomlg/neural-network-from-scratch

May 6, 2021 Web Crawling & Web Scraping

Scrape all the media from an OnlyFans account with Python

OnlyFans DataScraper (Python 3.9.X): scrape all the media from an OnlyFans account. Updated regularly.
Running the app via Docker: build and run the image, mounting the appropriate directories:
docker build -t only-fans . && docker run -it --rm --name onlyfans -v ${PWD}/.settings:/usr/src/app/.settings -v ${PWD}/.profiles:/usr/src/app/.profiles -v ${PWD}/.sites:/usr/src/app/.sites only-fans
Running on Linux: https://github.com/DIGITALCRIMINAL/OnlyFans/discussions/889
Running the app locally: from the project folder, open CMD/Terminal and run the command below:
pip install -r requirements.txt
Start: python start_ofd.py or double-click start_ofd.py
Open and […]

April 24, 2021 Web Crawling & Web Scraping

A web crawler that extracts everything from a URL

httpmapper is a web crawler made in Python.
What is this?
[X] Source code of a URL.
[X] Links of a website.
[X] Navigate the web using just a URL.
[X] Get e-mails of a website.
[X] See cookies of a website.
[X] Grab metadata.
Usage (Python >= 3.6):
git clone https://github.com/vLeeH/httpmapper.git
cd httpmapper
pip install -r requirements.txt
python src/main.py
How it works: first, run the file main.py and enter […]
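Two of the features listed above, collecting links and e-mail addresses from a page, can be sketched with the Python standard library alone. This is an illustrative assumption about how such extraction might work, not httpmapper's own code; the class and function names are hypothetical, and the e-mail pattern is deliberately simple.

```python
import re
from html.parser import HTMLParser

# Simple pattern for illustration; it will not match every valid address.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links_and_emails(html):
    """Return (links in document order, sorted unique e-mail addresses)."""
    collector = LinkCollector()
    collector.feed(html)
    emails = sorted(set(EMAIL_RE.findall(html)))
    return collector.links, emails


if __name__ == "__main__":
    page = '<p>Contact bob@example.com</p><a href="https://example.com/a">A</a>'
    print(extract_links_and_emails(page))
```

A crawler would call a function like this on each downloaded page and queue the returned links for the next visit.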
