Renewal · Everyday Coding at Forty

[Python at Forty] Hands-on Results

February 17, 2021·5 min read

Practice material - Python Securities Data Analysis

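Python Securities Data Analysis

Implements in Python the whole process of periodically collecting securities data through web scraping, analyzing it, trading automatically, and making predictions. Along the way, from the basic library for financial data processing (pandas) to stock price pre...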

www.aladin.co.kr

This is a writeup of the outputs I got by applying the book's practice examples to the portfolio I'm actually running or to individual stocks.

Main practice result 01. Sharpe ratio

On top of the Sharpe-ratio-optimized portfolios (max_sharpe, min_risk), I plotted my own portfolio (my_position).

(*book p.254 ~ p.267 & the author's GitHub source)

Plotting where my portfolio sits on top of the Sharpe-ratio portfolios

* Code-customization review:

Ah.. looking at the example's result graph, I got curious where my own portfolio would sit on it. I wanted to add it, but had no idea where to start. My first thought was to put my holdings into a variable. I started with a plain list and struggled for a while lol, then watched the results get mashed up when I appended it to the dummy data lol, and finally moved it into an np array.

Once it was in an np array, things started clicking into place. I computed my portfolio's return and risk and appended them after the 20,000 randomly generated portfolio combinations. Finally, I added one line to the plt graph code for that last record (my portfolio). Done.

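import numpy as np

# annual_ret, annual_cov, the port_* lists, sharpe_ratio and plt come from the book's example
# my portfolio: weights of my six holdings (they sum to 1)
my_portfolio = [0.497085, 0.286782, 0.076786, 0.058613, 0.040487, 0.040247]
my_weights = np.array(my_portfolio)
# overall portfolio return
my_returns = np.dot(my_weights, annual_ret)
# overall portfolio risk
my_risk = np.sqrt(np.dot(my_weights.T, np.dot(annual_cov, my_weights)))
# add my portfolio to the end of the random portfolios
port_weights.append(my_weights)
port_ret.append(my_returns)
port_risk.append(my_risk)
sharpe_ratio.append(my_returns / my_risk)

# my_position: the last row of the portfolio DataFrame, i.e. the record appended above
# (assumes df is the DataFrame the book's example builds from these lists)
my_position = df.iloc[[-1]]
# marker for my portfolio row
plt.scatter(x=my_position['Risk'], y=my_position['Returns'], c='r', marker='*', s=300)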
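The whole flow can also be tried end to end without the book's database. The following is a minimal sketch under stated assumptions, not the book's code: the daily returns are synthetic random numbers, the stock names A to F are hypothetical, the function name simulate_with_my_portfolio is made up for illustration, and only 2,000 random portfolios are generated to keep it quick. As in the snippet above, the Sharpe value is simply return divided by risk.

```python
import numpy as np
import pandas as pd


def simulate_with_my_portfolio(daily_ret, my_weights, n_random=2000, seed=0):
    """Random portfolios plus one fixed portfolio appended as the last row."""
    rng = np.random.default_rng(seed)
    annual_ret = daily_ret.mean() * 252   # annualized mean return per stock
    annual_cov = daily_ret.cov() * 252    # annualized covariance matrix
    n = daily_ret.shape[1]

    rows = []
    for _ in range(n_random):
        w = rng.random(n)
        w /= w.sum()                      # random weights that sum to 1
        rows.append(w)
    rows.append(np.asarray(my_weights, dtype=float))  # my portfolio goes last

    weights = np.vstack(rows)
    returns = weights @ annual_ret.to_numpy()
    # per-row quadratic form w^T * Cov * w, then square root = portfolio risk
    risk = np.sqrt(np.einsum('ij,jk,ik->i', weights, annual_cov.to_numpy(), weights))
    df = pd.DataFrame({'Returns': returns, 'Risk': risk, 'Sharpe': returns / risk})
    for i, name in enumerate(daily_ret.columns):
        df[name] = weights[:, i]
    return df


# Synthetic daily returns for six hypothetical stocks (not real market data).
rng = np.random.default_rng(42)
daily_ret = pd.DataFrame(rng.normal(0.0005, 0.02, size=(500, 6)),
                         columns=list('ABCDEF'))
my_portfolio = [0.497085, 0.286782, 0.076786, 0.058613, 0.040487, 0.040247]

df = simulate_with_my_portfolio(daily_ret, my_portfolio)
my_position = df.iloc[[-1]]                   # last row = my portfolio
max_sharpe = df.loc[[df['Sharpe'].idxmax()]]  # best return-to-risk row
min_risk = df.loc[[df['Risk'].idxmin()]]      # lowest-risk row
print(my_position[['Returns', 'Risk', 'Sharpe']])
```

Because my portfolio is always the final row, df.iloc[[-1]] picks it out no matter how many random portfolios come before it, and the three one-row frames can be passed to plt.scatter exactly as in the snippet above.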

* Review: oh yeah~ I customized it~ but that satisfaction didn't last long. Honestly, my customization didn't feel that useful. The portfolio weights alone don't give enough information. No matter how brilliant the buy weights are, if the entry point for each stock differs, the result can be completely different. I think it would be far more useful to first validate each individual stock and its entry point, and only then apply the Sharpe ratio as the final step. I have a strong hunch I'll study other indicators and come back. When I do, it'll be sharper than this Sharpe of mine. For today, I'm stopping here.

Main practice result 02. bollinger_TrendFollow

The Bollinger Band, which I first learned about through this Python book, seems (unlike some other indicators?) to still be widely used in the field. It draws a center line from a stock's moving average, places upper and lower bands a set number of standard deviations above and below it, and then reads the probabilistic trend of price movement within that range.

(*book p.275 ~ p.281 & the author's GitHub source)

(I forgot to change the title, but..) bollinger_TrendFollow for Samsung Electronics
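The band calculation itself is short. Here is a minimal sketch, assuming the conventional defaults of a 20-day moving average and bands at 2 standard deviations; the function name bollinger_bands and the synthetic random-walk prices are illustrative, not the book's code or real Samsung Electronics data. The book's trend-following strategy adds further conditions that are left out here.

```python
import numpy as np
import pandas as pd


def bollinger_bands(close, window=20, num_std=2):
    """Return center/upper/lower bands and %b for a closing-price Series."""
    center = close.rolling(window=window).mean()   # moving average (center line)
    std = close.rolling(window=window).std()       # rolling standard deviation
    upper = center + num_std * std
    lower = center - num_std * std
    pct_b = (close - lower) / (upper - lower)      # 0 = on lower band, 1 = on upper band
    return pd.DataFrame({'close': close, 'center': center,
                         'upper': upper, 'lower': lower, 'pct_b': pct_b})


# Synthetic random-walk prices (not real market data).
rng = np.random.default_rng(0)
close = pd.Series(80000 + rng.normal(0, 500, size=120).cumsum())
bands = bollinger_bands(close).dropna()            # first window-1 rows have no band yet
print(bands.tail())
```

The pct_b column (%b) expresses where the close sits inside the band, which is handy when the position within the range matters more than the raw price.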

* Review: many people seem to use Bollinger Bands as an automated trading tool to time buys and sells on a single stock. Personally, I think the indicator might be more useful for reading the overall market regime than for timing a single name. You might ask, "why the sudden talk of market regimes?"... as a rookie investor, I learned from googling around that the market has "seasons," like the weather. Financial-driven regime > earnings regime > reverse-financial regime > reverse-earnings regime, cycling like the four seasons. Bollinger Bands seem really well suited to reading exactly this regime.

Personally.. when handling big-ish data, I think the most important thing isn't the algorithm but the preprocessing. Preprocessing involves cutting, splitting, filtering and so on, but one critical step you can't skip is classification. Among the countless pieces of market data, one of the biggest classifications is identifying the regime, and Bollinger Bands might be very handy for that.

Main practice result 03. RNN — Recurrent Neural Network

These are predictions from applying TensorFlow's RNN (recurrent neural network).

(*book p.432 ~ p.446 & the author's GitHub source)

* Review: the environment setup was the hardest part, harder than understanding and writing the code. The environment details are in the link below... The actual values and the predictions follow a similar trend, but the results shift noticeably from run to run and every time I change the period or the epochs. That's probably because the example is only meant as a taste for learning. The image below is one I saved after several runs, when the result came out relatively consistent. For reference, the period runs from March last year (the Covid crash) to the day before writing, and I reduced the epochs in proportion to the shorter period.

* Installing TensorFlow: normalstory.tistory.com/entry/a-1?category=977100

Installing TensorFlow 2.2 on Python 3.8 64-bit (feat. Anaconda)

* Dev environment: Xiaomi Redmi laptop 64-bit, RYZEN 4000 series 7, Windows 10 64-bit. Unlike in the book, pip install tensorflow doesn't work on my computer. Everything went fine up through Python... sob sob... pip Install Error:..

normalstory.tistory.com

Main practice result 04. LSTM recurrent neural network

These are predictions from applying TensorFlow's LSTM (Long Short-Term Memory) model.

The book only offers a taste of it, so the prediction accuracy and consistency were a bit disappointing. So I started googling for other examples and found code that uses the recently updated TensorFlow 2, TensorBoard, and Keras to predict the price of a single stock. The image below shows that example's output. The stock is Amazon. Blue is the actual value, red is the prediction.

Amazon stock-price prediction example — Python TensorFlow Keras (LSTM)

The example above uses the Yahoo Finance API, which makes the code hard to use for Korean stocks. So I modified part of it to scrape Naver Finance instead. My example is Samsung Electronics. I picked it because it went through a stock split, last year's Covid-driven plunge and surge, and the weakness this February, so it has seen the most dramatic moves and is the best fit.
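Swapping the data source mostly comes down to reshaping the scraped table into the column layout the Yahoo-based code expects. Here is a minimal sketch of that step, not the actual modification: the helper names, the URL pattern in naver_daily_url, and the sample values are assumptions for illustration, and the fetching itself is left out so the snippet runs offline. The Korean headers are the ones used by Naver Finance's daily-price table, and 005930 is Samsung Electronics' ticker code.

```python
import pandas as pd

# Korean column headers of Naver Finance's daily-price table mapped to
# the lowercase names a Yahoo-style pipeline typically expects.
NAVER_COLUMNS = {'날짜': 'date', '종가': 'close', '전일비': 'diff',
                 '시가': 'open', '고가': 'high', '저가': 'low', '거래량': 'volume'}


def naver_daily_url(code, page=1):
    """Build a daily-price page URL (assumed pattern; check it against the live site)."""
    return f'https://finance.naver.com/item/sise_day.naver?code={code}&page={page}'


def normalize_naver_prices(df):
    """Rename columns, drop blank spacer rows, and index by ascending date."""
    df = df.rename(columns=NAVER_COLUMNS).dropna(subset=['date'])
    df = df.assign(date=pd.to_datetime(df['date'], format='%Y.%m.%d'))
    df = df.set_index('date').sort_index()          # oldest day first
    cols = ['open', 'high', 'low', 'close', 'volume']
    return df[cols].astype('int64')


# Hand-made sample rows in the scraped layout (values are made up).
scraped = pd.DataFrame({
    '날짜': ['2021.02.16', None, '2021.02.15'],
    '종가': [84900, None, 84200], '전일비': [700, None, 2600],
    '시가': [84500, None, 83800], '고가': [86000, None, 84500],
    '저가': [84200, None, 83300], '거래량': [20483100, None, 23529706],
})
prices = normalize_naver_prices(scraped)
print(naver_daily_url('005930'))
print(prices)
```

Sorting by date matters because the scraped pages list the newest day first, while a sequence model needs the rows in chronological order.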

Samsung Electronics stock-price prediction (trained on only 1/100) — Python TensorFlow Keras (LSTM)

The result above comes from training at 1/100 of the default setting. With the default EPOCHS of 500, a run takes 4 to 5 hours on my laptop. For reference, my laptop is a 13-inch Xiaomi RedmiBook with no dedicated GPU.

I'll study for another month or so, and if I really reach the "computer is killing me" level lol, I'll consider getting a desktop with a GPU. For now, let me shrink the training size and keep going~

* RNN vs LSTM comparison reference post ratsgo.github.io/natural%20language%20processing/2017/03/09/rnnlstm/

Let's Understand RNN and LSTM! · ratsgo's blog

In this post, we'll look at Recurrent Neural Networks (RNN) and Long Short-Term Memory models (LSTM), a type of RNN. After briefly covering an overview of the two algorithms, the forward and backward compute passes slowly...

ratsgo.github.io

View original - ScienceON

This original text is provided by ScienceON.

scienceon.kisti.re.kr

Summary: the bidirectional LSTM recurrent neural-network model was derived as a learning model with improved performance over the unidirectional LSTM.

This English version was translated by Claude.

#Ant (retail investor) #Python at Forty #Sharpe ratio #Automated trading system #Stocks #Stocks bite the ant #Securities data analysis #Python

Written by

친절한 찰쓰씨

Pleasant Charles — UI/UX researcher at AIT. Keeping notes on design, planning, and slow days here since 2010.
