
Python Function Result Caching by Joblib.Memory

Created
2/17/2021, 12:41:43 PM

Joblib.Memory

With Memory, the return value (output) of a function is stored in a specified directory. When the function is called again with the same arguments, the previously computed output is loaded instead of being recomputed.
The cache persists until the user deletes it, and it is reportedly accessible from other processes as well (needs verification). In other words, it caches a function's results on disk.
Usage is very simple, as shown below: pass your own function to cache() and use the wrapped function it returns. With this approach the output is saved as a pkl file.
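```python
from joblib import Memory

# Wrap the function so its results are cached on disk under ./cache.
if cache:
    extract_feature = Memory("./cache", verbose=0).cache(extract_feature)
```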
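To make the caching behavior concrete, here is a minimal sketch, assuming joblib is installed. The function body, the `calls` list, and the temporary cache directory are illustrative assumptions rather than the original code; `calls` exists only to show that the second call does not run the function body again.

```python
import tempfile

from joblib import Memory

calls = []  # illustrative: records each time the function body really runs


def extract_feature(x):
    calls.append(x)
    return x * 2


with tempfile.TemporaryDirectory() as cache_dir:
    memory = Memory(cache_dir, verbose=0)
    cached_extract_feature = memory.cache(extract_feature)

    first = cached_extract_feature(21)   # computed, then written under cache_dir
    second = cached_extract_feature(21)  # same argument: loaded from the disk cache

    print(first, second, len(calls))
```

A temporary directory is used here so the sketch cleans up after itself; a fixed path such as "./cache" is what lets the cache survive between runs.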
One caveat: to apply this to a class method, define the function to be cached as a global (module-level) function and call it from the class method. In my experience the reason is that pickle does not support class methods.
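```python
import soundfile

# Module-level function, so Memory can cache it.
# sample_wave_sequence is a helper defined elsewhere in the project.
def read_wave_with_resampling(file_path, target_sample_rate):
    wave_sequence, source_sample_rate = soundfile.read(file_path)
    # print('source_sample_rate,target_sample_rate',source_sample_rate,target_sample_rate)
    return sample_wave_sequence(wave_sequence, source_sample_rate, target_sample_rate)

# Inside the class method: call the cached wrapper instead of the raw function.
wave_data = self.memory.cache(read_wave_with_resampling)(wave_file_path, self.desired_sample_rate)
```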
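A self-contained version of the same pattern might look like the sketch below, assuming joblib is installed. The `Loader` class, `scale_values`, and the `factor` parameter are hypothetical names chosen for illustration; only the shape (module-level function, called from a method through the cached wrapper) follows the notes above.

```python
import tempfile

from joblib import Memory


def scale_values(values, factor):
    # Module-level function: this is what actually gets cached.
    return [v * factor for v in values]


class Loader:  # hypothetical class for illustration
    def __init__(self, cache_dir, factor):
        self.factor = factor
        self.memory = Memory(cache_dir, verbose=0)
        # Wrap once here, then reuse the wrapper in every method call.
        self._cached_scale = self.memory.cache(scale_values)

    def load(self, values):
        # Only plain arguments are passed to the cached function, never `self`.
        return self._cached_scale(values, self.factor)


with tempfile.TemporaryDirectory() as cache_dir:
    loader = Loader(cache_dir, factor=3)
    result = loader.load([1, 2, 3])
    result_again = loader.load([1, 2, 3])  # served from the cache
```

Wrapping once in `__init__` is a design choice of this sketch; wrapping at each call, as in the snippet above, should behave the same because the cache lives on disk rather than in the wrapper object.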
However, if the entire dataset can be stored as a numpy array, numpy.memmap and similar tools can read only the data at the desired index very quickly.
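As a rough illustration of the numpy.memmap route, the sketch below writes an array to a file once and then reads back a single row without loading the whole file. The file name, shape, and dtype are assumptions made up for the example.

```python
import os
import tempfile

import numpy as np

shape = (1000, 16)  # illustrative: 1000 items, 16 features each

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "features.dat")

    # Write the full dataset to disk once.
    writer = np.memmap(path, dtype=np.float32, mode="w+", shape=shape)
    writer[:] = np.arange(shape[0] * shape[1], dtype=np.float32).reshape(shape)
    writer.flush()
    del writer  # close the write mapping

    # Later: map the file read-only and fetch only the row that is needed.
    reader = np.memmap(path, dtype=np.float32, mode="r", shape=shape)
    row = np.array(reader[123])  # copy one row into regular memory
    del reader  # close the mapping before the directory is removed
```

Note that a memmap file stores raw values only, so the shape and dtype have to be known (or saved separately) when reopening it.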
When you are completely done, the cached data can be deleted with memory.clear.
A convenient way to call memory.clear is Python's try/finally statement. The finally block always runs, whether or not an error occurred, so it is a handy place to release resources.
```python
try:
    start_loop()
finally:
    # Runs whether or not start_loop() raised; removes the cached data.
    memory.clear(warn=False)
```
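The following sketch, assuming joblib is installed, shows the cleanup running even when the loop fails. `step`, the simulated `RuntimeError`, and the `executed` list are illustrative assumptions; `start_loop` here is a stand-in for the real loop.

```python
import tempfile

from joblib import Memory

executed = []  # illustrative: records real executions of step()


def step(x):
    executed.append(x)
    return x + 1


def start_loop(cached_step):
    cached_step(1)
    raise RuntimeError("simulated failure")  # illustrative error


with tempfile.TemporaryDirectory() as cache_dir:
    memory = Memory(cache_dir, verbose=0)
    cached_step = memory.cache(step)
    failed = False
    try:
        try:
            start_loop(cached_step)
        finally:
            memory.clear(warn=False)  # runs even though start_loop raised
    except RuntimeError:
        failed = True

    cached_step(1)  # the cache was cleared, so this computes again
```

The outer try/except exists only so the sketch can keep running after the simulated error; in real code the exception would normally propagate.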

Caveats when using Joblib.Memory

1. Class methods cannot be cached; only global functions can.
2. Passing an instance of a user-defined class as an input argument raises a pickling error.
3. With very many or very complex input arguments, it becomes very slow. Results are stored under a hash of the input arguments, and when those values are too large and complex, hashing them is costly and the number of distinct keys the hash has to cover grows too large.
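To get a feel for the third caveat, the sketch below times `joblib.hash` on a small and a large argument. `joblib.hash` is joblib's public hashing helper; as far as I know Memory builds its cache keys with the same mechanism, but treat that as an assumption. The argument sizes are arbitrary and the timings will vary by machine.

```python
import time

import joblib

small_args = list(range(10))
large_args = list(range(1_000_000))  # illustrative "large and complex" argument


def time_hash(obj):
    """Return (hash key, seconds spent hashing) for obj."""
    start = time.perf_counter()
    key = joblib.hash(obj)
    return key, time.perf_counter() - start


small_key, small_seconds = time_hash(small_args)
large_key, large_seconds = time_hash(large_args)
print(f"small: {small_seconds:.6f}s, large: {large_seconds:.6f}s")

# Equal inputs map to the same key; different inputs map to different keys.
same_key = joblib.hash(list(range(10)))
```

One way to keep the key cheap is to pass something small that identifies the data, such as a file path, and load the data inside the cached function, which is what read_wave_with_resampling does with file_path.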
⇒ Having used Joblib.Memory a lot, I find it restrictive in many ways. All three constraints above are critical, so it seems better to just build a dictionary of the preprocessing results manually and save it with pickle.
```python
import os
import pickle

# Check for a cached dataset
if os.path.exists(cache_data_dir):
    with open(cache_data_dir, "rb") as file:
        self.data_list = pickle.load(file)
else:
    # Do preprocessing
    # Save data in cache dir
    with open(cache_data_dir, "wb") as file:
        pickle.dump(self.data_list, file)
```
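The same idea can be written as a small standalone helper using only the standard library. This is a sketch: `load_or_build`, `preprocess`, and the sample dictionary are hypothetical names and data, not the original preprocessing code.

```python
import os
import pickle
import tempfile


def load_or_build(cache_path, build_fn):
    """Return the data cached at cache_path, or build it and save it there."""
    if os.path.exists(cache_path):
        with open(cache_path, "rb") as file:
            return pickle.load(file)
    data = build_fn()
    with open(cache_path, "wb") as file:
        pickle.dump(data, file)
    return data


build_calls = []  # illustrative: counts how often preprocessing really runs


def preprocess():  # hypothetical stand-in for the real preprocessing
    build_calls.append(1)
    return {"a.wav": [0.1, 0.2], "b.wav": [0.3]}


with tempfile.TemporaryDirectory() as tmp_dir:
    cache_path = os.path.join(tmp_dir, "dataset.pkl")
    first = load_or_build(cache_path, preprocess)   # runs preprocess and saves
    second = load_or_build(cache_path, preprocess)  # reads the pickle instead
```

Unlike Memory, this cache is keyed only by the file path, so it has to be deleted by hand whenever the preprocessing logic or its inputs change.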