Self-Powered Mission 01. Turning the Books at Home into a Database
2. Coding
2-2) Crawling it in
1. Goal
Based on the titles recorded in the database, crawl to pull in the values I want.
2. Execution plan
1) Research
- A crawler using Snoopy http://happycgi.com/16469
2) Coding
- Adapted the reference blog's example: code
3) An issue(?) comes up
- Hmm.. this isn't actually the result I wanted.
- What I wanted was to pull in the detailed info for specific keywords I care about..
2-3) Using a search API
1. Revising the goal
Based on the titles in the database, use a search API to pull in the values I want.
2. Execution plan
1) Research
(1) Public Data Portal
-> Of the columns I had set up, price info is missing, and instead of a category name there's a subject name... so its data doesn't quite line up with my table.
Catch you next time~
(2) The book part of Naver's Search API
-> It has all the info I wanted.
* By the way, Kakao also provides an API. But... signing up fresh was a pain, so I'm just going with Naver.
* Each online bookstore also provides its own API! In a way those are closer to the original source.
What's more, if someone buys a book through a page built on an online bookstore's API ^^
the bookstore pays out a certain % of the purchase amount~ Jackpot!
Once the basic test is done, this looks like a solid candidate for the second self-powered mission ^^
2) Setting up Naver Developers https://developers.naver.com/main/
(1) Register an application to use the API.
* The next page asks for the URL (domain) where the API will be used.. you can enter your personal domain
or just localhost; either works.
(2) On the API settings page,
click into your newly created application and
you'll find the values to use for $client_id = "******" and $client_secret = "******".
3) Coding
I was able to pull in the desired values as an array. (Source:
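For reference, here is a minimal sketch of what such a request could look like. It is written in Python purely for illustration and is not the code used in this post. The endpoint URL, the X-Naver-Client-Id / X-Naver-Client-Secret headers, and the "items" key are assumptions based on Naver's search API documentation, and the function names are hypothetical.

```python
import json
import urllib.parse
import urllib.request

# Assumed endpoint of the book search API (JSON variant).
BOOK_SEARCH_URL = "https://openapi.naver.com/v1/search/book.json"


def build_book_request(title, client_id, client_secret, display=5):
    """Build (but do not send) a search request for one book title."""
    params = urllib.parse.urlencode({"query": title, "display": display})
    return urllib.request.Request(
        f"{BOOK_SEARCH_URL}?{params}",
        headers={
            # Credentials issued for the registered application.
            "X-Naver-Client-Id": client_id,
            "X-Naver-Client-Secret": client_secret,
        },
    )


def search_books(title, client_id, client_secret):
    """Send the request and return the list of matching items."""
    request = build_book_request(title, client_id, client_secret)
    with urllib.request.urlopen(request, timeout=10) as response:
        payload = json.load(response)
    # "items" is assumed to hold one dict per matching book.
    return payload.get("items", [])
```

Calling search_books() once per title stored in the database would give back one list of results per book, which matches the "values as an array" result described above.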
2-4) Bringing XML into HTML
1. Goal
Convert the XML-formatted response into HTML markup. This was the hardest part... ㅜㅜ
I guess it's too obvious to most people, so not many posts cover it.. but for beginners this is exactly the hard part.. ㅜㅜ
2. Execution plan
1) Research
silqia's study blog https://chamggae.tistory.com/78
2) Coding
Following the blog above, I parsed the XML ~ and successfully dropped it into a table~ (Source:
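The XML-to-table step can be sketched roughly as follows. This is a Python illustration, not the code from the reference blog. The sample XML is made up, and its RSS-style layout (item elements with title, author, and price children) is an assumption about the shape of the response.

```python
import html
import xml.etree.ElementTree as ET

# Made-up sample shaped like an RSS-style search response (an assumption).
SAMPLE_XML = """<rss><channel>
<item><title>Book A</title><author>Kim</author><price>12000</price></item>
<item><title>Book B &amp; C</title><author>Lee</author><price>9000</price></item>
</channel></rss>"""

FIELDS = ["title", "author", "price"]


def xml_to_table(xml_text, fields=FIELDS):
    """Turn every <item> element into one row of an HTML table."""
    root = ET.fromstring(xml_text)
    header = "".join(f"<th>{html.escape(name)}</th>" for name in fields)
    rows = [f"<tr>{header}</tr>"]
    for item in root.iter("item"):
        # findtext returns the child's text, or '' when the child is missing.
        cells = "".join(
            f"<td>{html.escape(item.findtext(name, default=''))}</td>"
            for name in fields
        )
        rows.append(f"<tr>{cells}</tr>")
    return "<table>\n" + "\n".join(rows) + "\n</table>"


print(xml_to_table(SAMPLE_XML))
```

Escaping each value with html.escape keeps characters such as & or < in a title from breaking the generated markup.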
3) Review
Having worked through it, and having looked at the reference blog, fetching the data as JSON looks easier than XML.
With XML you have to step down the tree one level at a time (A -> A' -> a) and assign each value to its own variable,
while with JSON, a, b, c.. already arrive as fields of each decoded object, which makes things less of a hassle.
From the next example on, I'll switch over to JSON.
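To make the comparison concrete, here is a small Python sketch of the JSON side. The sample payload and its field names (items, title, author, price) are made up for illustration and are not taken from a real response.

```python
import json

# Made-up sample payload (an assumption about the response shape).
SAMPLE_JSON = """{"items": [
  {"title": "Book A", "author": "Kim", "price": "12000"},
  {"title": "Book B", "author": "Lee", "price": "9000"}
]}"""


def extract_rows(json_text, fields=("title", "author", "price")):
    """Decode the payload and pick the wanted fields from each item."""
    items = json.loads(json_text).get("items", [])
    # Each item is already a dict, so no tree walking is needed.
    return [[item.get(name, "") for name in fields] for item in items]


for row in extract_rows(SAMPLE_JSON):
    print(row)
```

One decoding call yields ready-to-use dictionaries, so there is no per-level lookup like the A -> A' -> a steps needed with XML.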
