I want to pick up the last part of every URL for further date-format processing. How could I do that?
url: href="/news/this-super-chic-paris-hotel-is-hosting-dinners-in-its-swimming-pool-092520"
result: 092520
Thanks JC
Based on this quick benchmark, the fastest obvious method is rpartition:
import timeit


def t(f, *args):
    # Time f(*args) and report throughput in operations per second
    timer = timeit.Timer(lambda: f(*args))
    loops, time = timer.autorange()
    print(f"{f.__name__:<10s}: {loops / time:.2f} ops/s ({loops} loops in {time})")


def split(x):
    return x.rsplit("-", 1)[-1]


def part(x):
    return x.rpartition("-")[2]


def rindex(x):
    return x[x.rindex("-") + 1 :]


url = "/news/this-super-chic-paris-hotel-is-hosting-dinners-in-its-swimming-pool-092520"

# Check our implementations for sanity
assert split(url) == part(url) == rindex(url) == "092520"

# Benchmark all of them
t(split, url)
t(part, url)
t(rindex, url)
On my Mac (Python 3.8), this outputs:
split     : 1715847.40 ops/s (500000 loops in 0.291)
part      : 1897332.28 ops/s (500000 loops in 0.264)
rindex    : 1510618.06 ops/s (500000 loops in 0.331)
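(A fair share of the measured time is surely overhead from the lambda trampoline, so the real differences between the three are probably somewhat larger than these numbers suggest.)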
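Since the goal is further date processing, here is a minimal sketch of how the extracted suffix might be turned into a date. It assumes the six digits are MMDDYY (so 092520 would be 25 September 2020), which the question does not confirm, and trailing_date is just a hypothetical helper name. Checking the separator matters because rpartition returns the whole string as the last element when there is no "-" at all.

```python
from datetime import date, datetime
from typing import Optional


def trailing_date(url: str) -> Optional[date]:
    """Return the date encoded after the last "-" in url, or None."""
    # rpartition gives ("", "", url) when no "-" is present, so test the separator
    _, sep, tail = url.rpartition("-")
    if not sep:
        return None
    try:
        # Assumption: the suffix is MMDDYY, e.g. "092520" -> 2020-09-25
        return datetime.strptime(tail, "%m%d%y").date()
    except ValueError:
        # The suffix was not a valid MMDDYY date
        return None


url = "/news/this-super-chic-paris-hotel-is-hosting-dinners-in-its-swimming-pool-092520"
print(trailing_date(url))  # 2020-09-25
```

If the site uses a different layout (say DDMMYY), only the format string passed to strptime needs to change.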