Seongeun Kwon kse0202

  • Seoul, Republic of Korea
@kse0202
kse0202 / concat_files.md
Created June 28, 2021 08:52
ํด๋” ์•ˆ์— ์žˆ๋Š” ํŒŒ์ผ(excel, csv)๋“ค ํ•œ๊ฐœ๋กœ ํ•ฉ์นœ dataframe ๋งŒ๋“ค๊ธฐ

ํด๋”์•ˆ์— ์žˆ๋Š” ํŒŒ์ผ๋“ค ํ•˜๋‚˜๋กœ ๋งŒ๋“ค๊ธฐ

import pandas as pd
import os
import glob


all_data = []
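
The preview ends right after `all_data = []`. A minimal sketch of how the rest could be written, assuming every file in the folder shares the same columns: the function name `concat_folder` and its `folder` argument are hypothetical, not taken from the gist.

```python
import glob
import os

import pandas as pd


def concat_folder(folder):
    """Read every csv and excel file in folder and stack them into one DataFrame."""
    all_data = []
    for path in sorted(glob.glob(os.path.join(folder, "*"))):
        ext = os.path.splitext(path)[1].lower()
        if ext == ".csv":
            frame = pd.read_csv(path)
        elif ext in (".xlsx", ".xls"):
            frame = pd.read_excel(path)  # needs an Excel engine such as openpyxl
        else:
            continue  # skip anything that is not a csv or excel file
        all_data.append(frame)

    if not all_data:
        return pd.DataFrame()  # nothing to combine
    # ignore_index=True renumbers the rows 0..n-1 across all files
    return pd.concat(all_data, ignore_index=True)
```

Collecting the frames in a list and calling `pd.concat` once is generally cheaper than appending to a DataFrame inside the loop.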
@kse0202
kse0202 / def_hddd_to_wgs84.md
Created June 28, 2021 07:26
Defining a function that converts coordinates expressed in hddd format to wgs84 format

Define the hddd_to_wgs84(table,x,y) function -> use it to do the conversion

"hddd.ddddd" is a notation that writes degrees with decimal places, so that the degree unit can be subdivided more finely

# table = name of the table whose coordinates need to be computed
# x, y  = names of the x and y columns in table, passed as 'x','y'

def hddd_to_wgs84(table,x,y):
@kse0202
kse0202 / postgresql_1.md
Created June 28, 2021 07:16
PostgreSQL geometry functions (st_point, st_makeline, st_setsrid) and the lead function (also done in python)

ST_SetSRID: assign an SRID to a geometry column

ST_SetSRID(st_point(start_x::double precision, start_y::double precision), 4326)

ST_Transform: change the SRID

query_set_srid = "ALTER TABLE table_nm \
                    ALTER COLUMN geom TYPE geometry(point, 5179)\
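
The preview above is cut off before the query ends, so the following is not the author's query. It is a minimal sketch of how the two ideas (assigning an SRID with ST_SetSRID and reprojecting a column with ST_Transform) are commonly written for PostGIS. The helper names `point_expr` and `build_transform_query` are hypothetical, and the column name `geom` is taken from the snippet above.

```python
def _check_identifier(name):
    # Table and column names cannot be passed as bound parameters, so this
    # sketch only accepts plain identifiers (no schema prefix, no quoting).
    if not name.isidentifier():
        raise ValueError(f"unsafe identifier: {name!r}")
    return name


def point_expr(x_col, y_col, srid=4326):
    """SQL expression that builds a point from two columns and tags it with an SRID."""
    x_col, y_col = _check_identifier(x_col), _check_identifier(y_col)
    return (
        f"ST_SetSRID(ST_Point({x_col}::double precision, "
        f"{y_col}::double precision), {int(srid)})"
    )


def build_transform_query(table_nm, srid, geom_col="geom", geom_type="Point"):
    """ALTER TABLE statement that reprojects geom_col to srid in place."""
    table_nm, geom_col = _check_identifier(table_nm), _check_identifier(geom_col)
    srid = int(srid)
    # USING tells PostgreSQL how to convert the existing values; ST_Transform
    # needs the column to already carry a known source SRID.
    return (
        f"ALTER TABLE {table_nm} "
        f"ALTER COLUMN {geom_col} TYPE geometry({geom_type}, {srid}) "
        f"USING ST_Transform({geom_col}, {srid});"
    )
```

The resulting string can be passed to a cursor's `execute` call. ST_SetSRID only labels the coordinates, while ST_Transform actually recomputes them, which is why the two are kept as separate helpers here.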
@kse0202
kse0202 / python_postgres_table_download_update.md
Last active July 11, 2023 04:43
Using python in Jupyter Notebook: 1. connect to a postgresql DB 2. load a table in the DB as a DataFrame 3. update an existing table in the DB from a DataFrame
#-*-coding:utf-8
import psycopg2
import pandas as pd
import numpy as np
import csv

import sql
from sqlalchemy import create_engine
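
The preview stops at the imports. As a rough sketch of the three steps in the description (connect, load a table as a DataFrame, write a DataFrame back), something like the following could work. The function names `build_postgres_url` and `copy_table` and all their parameters are assumptions for illustration, and `if_exists="append"` adds rows rather than updating existing ones, which may differ from what the gist does.

```python
from urllib.parse import quote_plus


def build_postgres_url(user, password, host, port, dbname):
    """Return a SQLAlchemy URL for PostgreSQL through the psycopg2 driver."""
    # quote_plus escapes characters such as "@" or "/" that would break the URL
    return (
        f"postgresql+psycopg2://{quote_plus(user)}:{quote_plus(password)}"
        f"@{host}:{port}/{dbname}"
    )


def copy_table(url, source_table, target_table):
    """Load source_table as a DataFrame and append its rows to target_table."""
    # Imported here so the URL helper works without the database stack installed
    import pandas as pd
    from sqlalchemy import create_engine

    engine = create_engine(url)  # step 1: connection settings
    df = pd.read_sql_table(source_table, engine)  # step 2: table -> DataFrame
    # step 3: DataFrame -> existing table; "append" adds rows, it does not update them
    df.to_sql(target_table, engine, if_exists="append", index=False)
    return df
```

Escaping the password matters in practice: a password containing `@` would otherwise be read as the start of the host name.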
@kse0202
kse0202 / python_postgres_db_connect.md
Last active June 28, 2021 05:33
How to connect to a PostgreSQL DB with python in Jupyter Notebook (using psycopg2)
# Load packages

#-*-coding:utf-8
import psycopg2
import pandas as pd
import csv

import sql
from sqlalchemy import create_engine
@kse0202
kse0202 / remove_emoji.md
Last active June 9, 2021 12:52
Text preprocessing with the python re package

Removing Hangul consonants, Hangul vowels, special characters, and emoticons with the python re package

import re


def get_clean_text(df):
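
The body of `get_clean_text` is not shown in the preview. A minimal sketch of the kind of cleaning the description names, written for a single string rather than a DataFrame, might look like this. The function name `clean_text` and the exact character ranges are assumptions; the emoji ranges in particular are a common subset, not a complete list.

```python
import re

# Standalone Hangul consonants and vowels (e.g. laughing or crying jamo runs)
JAMO = re.compile("[\u3131-\u314E\u314F-\u3163]+")
# A common subset of emoji and pictograph ranges; not exhaustive
EMOJI = re.compile("[\U0001F300-\U0001FAFF\u2600-\u27BF\uFE0F\u200D]+")
# Anything that is neither a word character nor whitespace.
# \w is Unicode-aware, so Hangul syllables, letters, digits and "_" are kept.
SPECIAL = re.compile(r"[^\w\s]")


def clean_text(text):
    """Strip lone jamo, emoji, and special characters, then tidy whitespace."""
    text = JAMO.sub("", text)
    text = EMOJI.sub("", text)
    text = SPECIAL.sub(" ", text)
    return re.sub(r"\s+", " ", text).strip()
```

To apply it to a DataFrame column, `df[col].astype(str).map(clean_text)` is one option. Lone jamo have to be removed explicitly because `\w` treats them as letters.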
@kse0202
kse0202 / youtube_crawler_20210512.md
Last active June 9, 2021 12:31
YouTube comment crawling 20210512

Crawling YouTube comments with selenium

  1. After entering a keyword, collect the title, url, and info of each video in the search results and save them as a DataFrame and a csv
  2. Visit each saved url, collect the video's comments, and save them as a DataFrame and a csv
  3. Save all comments into a single csv

1. Load packages

# -*- coding:utf-8 -*-
@kse0202
kse0202 / remove_blanks_right_left.md
Last active September 21, 2020 09:43
Removing whitespace on the left and right
import re

def clean_blank(text):
    
    cleaned_text = text.lstrip()  # remove leading (left) whitespace
    cleaned_text = cleaned_text.rstrip()  # remove trailing (right) whitespace
    return cleaned_text
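
For reference, calling `lstrip()` and then `rstrip()` gives the same result as a single `strip()` call. A compact variant, assuming the input is always a `str`:

```python
def clean_blank(text):
    """Remove leading and trailing whitespace; inner whitespace is kept."""
    return text.strip()


example = clean_blank("  hello   world \n")  # only the outer whitespace is removed
```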
@kse0202
kse0202 / select_rows_contains_word.md
Created May 14, 2020 02:12
๋ฐ์ดํ„ฐ ํ”„๋ ˆ์ž„์—์„œ ํŠน์ • ๋‹จ์–ด ํฌํ•จํ•˜๋Š” ํ–‰ ์„ ํƒํ•˜๊ธฐ
df.loc[df_mapo['col_name'].str.contains('words', na= False)]
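
A small worked example of the same filter, using made-up data. The column name `col_name` and the search term come from the line above; the rows are invented for illustration.

```python
import pandas as pd

df = pd.DataFrame(
    {"col_name": ["some words here", "nothing", None, "WORDS in caps"]}
)

# na=False treats missing values as "no match" instead of propagating NaN,
# which keeps the mask purely boolean so it can be used for indexing.
mask = df["col_name"].str.contains("words", na=False)
matched = df.loc[mask]

# case=False makes the match case-insensitive
matched_any_case = df.loc[df["col_name"].str.contains("words", case=False, na=False)]
```

By default `str.contains` interprets the pattern as a regular expression; pass `regex=False` when the search term should be matched literally.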
@kse0202
kse0202 / sampling.md
Created April 1, 2020 02:24
Types of sampling for handling imbalanced data

UnderSampling

  • RandomUnderSampler: random under-sampling method
  • TomekLinks: Tomek's link method
  • CondensedNearestNeighbour: condensed nearest neighbour method
  • OneSidedSelection: under-sampling based on one-sided selection method
  • EditedNearestNeighbours: edited nearest neighbour method
  • NeighbourhoodCleaningRule: neighbourhood cleaning rule

OverSampling

  • RandomOverSampler: random sampler
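
To make the difference between the two families concrete, here is a minimal standard-library sketch of plain random under-sampling and random over-sampling. It only illustrates the idea behind the random samplers listed above and is not how any library implements them; the function name `random_resample` and its parameters are hypothetical.

```python
import random
from collections import defaultdict


def random_resample(samples, labels, mode="under", seed=0):
    """Balance classes by random under-sampling or random over-sampling."""
    if mode not in ("under", "over"):
        raise ValueError("mode must be 'under' or 'over'")
    rng = random.Random(seed)

    # Group the samples by class label
    by_class = defaultdict(list)
    for sample, label in zip(samples, labels):
        by_class[label].append(sample)

    sizes = [len(group) for group in by_class.values()]
    # Under-sampling shrinks every class to the smallest class size,
    # over-sampling grows every class to the largest class size.
    target = min(sizes) if mode == "under" else max(sizes)

    out_samples, out_labels = [], []
    for label, group in by_class.items():
        if mode == "under":
            picked = rng.sample(group, target)  # drawn without replacement
        else:
            # keep every original row, then draw the extra rows with replacement
            picked = group + rng.choices(group, k=target - len(group))
        out_samples.extend(picked)
        out_labels.extend([label] * target)
    return out_samples, out_labels
```

Under-sampling discards majority-class rows, so it loses information; over-sampling keeps everything but duplicates minority-class rows, which can encourage overfitting.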