Nick2bad4u / Typpi.online Kiwifarm Project

Kiwifarm Project for scraping data from Kiwifarms

KiwiFarmer

KiwiFarmer is a Python package for scraping KiwiFarms threads and posts, extracting field values, and storing the results in a MySQL database that it creates.

Run script

KiwiFarmer includes a script, ``run_smat.py``, that indexes all KiwiFarms posts into an Elasticsearch instance. The script uses a Redis database to keep track of which pages have already been indexed, so pages are not reindexed unnecessarily. To run the script continuously, use the command:

.. code-block:: bash

   watch -n0 python run_smat.py
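
The Redis bookkeeping described above can be sketched roughly as follows. This is a minimal illustration, not the code in ``run_smat.py``: the key name ``indexed_pages``, the function ``index_new_pages``, and the ``InMemorySeenStore`` stand-in are all hypothetical, and the Elasticsearch call is replaced by an arbitrary ``index_page`` callback so the example runs without any servers.

```python
from typing import Callable, Iterable, Protocol

# Hypothetical key name; the real script may use a different one.
INDEXED_KEY = "indexed_pages"


class SeenStore(Protocol):
    """The two set commands this sketch needs from a Redis-like client."""

    def sismember(self, name: str, value: str) -> bool: ...

    def sadd(self, name: str, *values: str) -> int: ...


class InMemorySeenStore:
    """Stand-in for a Redis client so the sketch runs without a server."""

    def __init__(self) -> None:
        self._sets: dict[str, set[str]] = {}

    def sismember(self, name: str, value: str) -> bool:
        return value in self._sets.get(name, set())

    def sadd(self, name: str, *values: str) -> int:
        members = self._sets.setdefault(name, set())
        added = set(values) - members
        members |= added
        return len(added)


def index_new_pages(
    page_urls: Iterable[str],
    store: SeenStore,
    index_page: Callable[[str], object],
) -> list[str]:
    """Index only the pages that the store has not recorded yet."""
    newly_indexed = []
    for url in page_urls:
        if store.sismember(INDEXED_KEY, url):
            continue  # already indexed on an earlier run
        index_page(url)
        # Record the page only after indexing succeeds, so a failed
        # page is retried on the next run.
        store.sadd(INDEXED_KEY, url)
        newly_indexed.append(url)
    return newly_indexed
```

Because the function is safe to call repeatedly, it suits a loop such as ``watch``. With a real Redis server, a ``redis.Redis()`` client from the redis-py package could presumably be passed in place of ``InMemorySeenStore``, since it provides ``sismember`` and ``sadd`` methods.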

Workflow

KiwiFarmer also includes scripts for a workflow that downloads all website pages as HTML files, extracts the relevant field data, and stores the data in a MySQL database. These scripts are in the ``workflow/`` subdirectory of the package root directory. For more information, see ``docs/workflow.rst``.
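
As a rough illustration of the extract and store stages of such a workflow, the sketch below parses saved HTML files and writes the results to a database. It is not the code in ``workflow/``. It assumes, hypothetically, that each post is an ``<article>`` tag carrying ``data-content`` and ``data-author`` attributes, and it uses the standard-library ``sqlite3`` module as a stand-in for MySQL so that it runs without a database server. The download stage is omitted, and the names ``PostParser``, ``extract_posts``, and ``store_posts`` are invented for this example.

```python
import sqlite3
from html.parser import HTMLParser
from pathlib import Path


class PostParser(HTMLParser):
    """Collects (post_id, author) pairs from <article> tags.

    The attribute names are assumptions about the page markup.
    """

    def __init__(self) -> None:
        super().__init__()
        self.posts: list[tuple[str, str]] = []

    def handle_starttag(self, tag, attrs):
        if tag != "article":
            return
        fields = dict(attrs)
        if "data-content" in fields and "data-author" in fields:
            self.posts.append((fields["data-content"], fields["data-author"]))


def extract_posts(html_dir: Path) -> list[tuple[str, str]]:
    """Parse every saved .html file in html_dir and gather its posts."""
    posts = []
    for path in sorted(html_dir.glob("*.html")):
        parser = PostParser()
        parser.feed(path.read_text(encoding="utf-8"))
        posts.extend(parser.posts)
    return posts


def store_posts(conn: sqlite3.Connection, posts) -> None:
    """Write posts to a table; rerunning replaces rows instead of duplicating."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS post (post_id TEXT PRIMARY KEY, author TEXT)"
    )
    conn.executemany("INSERT OR REPLACE INTO post VALUES (?, ?)", posts)
    conn.commit()
```

Keeping extraction separate from storage means the saved HTML files can be re-parsed later, for example after adding a new field, without downloading the pages again.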

TODO

File List

# Here is a list of files included in this repository: