Commit df95d0bc (unverified), authored 2 years ago by Pushkar Chauhan, committed by GitHub 2 years ago

Update README.md

added Setup, Documentation, API Reference, Authors

Parent: 82941106
Branches: main, develop
Merge request: !1 (addition: main_server_code, scripts, docs)
1 changed file: README.md (77 additions, 1 deletion)
# python-crawler-quickstart
Python-based web crawler quick start project.
The project uses the Selenium and Scrapy frameworks for scraping.
## Setup
* Clone this repository:
```
git clone "https://github.com/dileep-gadiraju/python-webscraping-quickstart"
```
* After cloning, install the Python packages by running the following command from `./src`:
```
pip install -r "requirements.txt"
```
* Start the Elasticsearch and Kibana services as Docker containers (see https://www.elastic.co/guide/en/kibana/current/docker.html).
* Import the API collections from `./test` into a REST client tool.
* Set the required global variables.
* Run the following command from `./src` to start the server:
```
python app.py
```
A successful local deployment should report that the server is up on port 5001.
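To confirm from a script that the server is listening, a plain TCP connection test is enough. The following is a minimal sketch: the helper name `is_port_open` and the host `127.0.0.1` are assumptions for illustration, and only the port number comes from this README.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to (host, port) succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # Assumption: the server runs on the local machine; 5001 is the port named above.
    print("server reachable:", is_port_open("127.0.0.1", 5001))
```

This only checks that something accepts connections on the port; it does not verify that the scraping API itself responds.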
## Documentation
For scripting and configuration documentation, refer to the `./docs` folder.
## API Reference
#### Get all Agents
```
GET /general/agents
```
_No parameters required_
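As a rough illustration, this endpoint could be called from Python with the standard library. The base URL `http://localhost:5001` and the helper names are assumptions, and the README does not document the response format, so decoding it as JSON is also an assumption.

```python
import json
import urllib.request

# Assumption: the server runs locally on the port mentioned in Setup.
BASE_URL = "http://localhost:5001"


def build_agents_request(base_url: str = BASE_URL) -> urllib.request.Request:
    """Build (but do not send) a GET request for the agents endpoint."""
    return urllib.request.Request(f"{base_url}/general/agents", method="GET")


def fetch_agents(base_url: str = BASE_URL):
    """Send the request; assumes (not confirmed here) that the body is JSON."""
    with urllib.request.urlopen(build_agents_request(base_url), timeout=10) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `fetch_agents()` requires the server to be running; `build_agents_request()` can be inspected without it.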
#### Start a Scraping Job
```
POST /general/run
```
_The following request body parameters are mandatory:_
| Parameter | Type     | Description                       |
| :-------- | :------- | :-------------------------------- |
| `agentId` | `string` | `Valid AGENT-ID` |
| `type` | `string` | `Valid Type Of JOB` |
| `search` | `string` | `my search query` |
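A request carrying the three mandatory parameters might be built as sketched below. The README does not state how the body is encoded, so sending it as JSON is an assumption, as are the helper name `build_run_request` and the base URL used in the usage line.

```python
import json
import urllib.request


def build_run_request(base_url: str, agent_id: str, job_type: str, search: str):
    """Build (but do not send) a POST request that starts a scraping job."""
    payload = {"agentId": agent_id, "type": job_type, "search": search}
    # All three parameters are mandatory, so reject empty values early.
    missing = [name for name, value in payload.items() if not value]
    if missing:
        raise ValueError(f"missing mandatory parameters: {', '.join(missing)}")
    return urllib.request.Request(
        f"{base_url}/general/run",
        data=json.dumps(payload).encode("utf-8"),  # assumption: JSON body
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

To send it, pass the result to `urllib.request.urlopen`, for example `urllib.request.urlopen(build_run_request("http://localhost:5001", agent_id, job_type, query))`, with values taken from the agents endpoint.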
#### Get Job Status
```
GET /general/status
```
| Parameter | Type     | Description                       |
| :-------- | :------- | :-------------------------------- |
| `JobId` | `string` | `(required) uuid of a job` |
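A status lookup could be sketched as follows. The README does not say how `JobId` is passed; sending it as a URL query parameter is an assumption, as are the helper name and the base URL in the usage line.

```python
import urllib.parse
import urllib.request
import uuid


def build_status_request(base_url: str, job_id: str) -> urllib.request.Request:
    """Build (but do not send) a GET request for a job's status."""
    # The table describes JobId as a UUID; uuid.UUID raises ValueError otherwise.
    uuid.UUID(job_id)
    # Assumption: JobId travels in the query string.
    query = urllib.parse.urlencode({"JobId": job_id})
    return urllib.request.Request(f"{base_url}/general/status?{query}", method="GET")
```

For example, `build_status_request("http://localhost:5001", job_id)` yields a request that can be passed to `urllib.request.urlopen` once a job has been started.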
## Authors
- [@dileep-gadiraju](https://github.com/dileep-gadiraju)
- [@Pushkar-Chauhan](https://github.com/Pushkar191098)
- [@dhiru579](https://github.com/dhiru579)
- [@ArchakGAmruth](https://github.com/ArchakGAmruth)