The list of WARCs is cached in the directory specified by the cache
argument to ccwarcs_options.
get_cc_index(urls, crawls, .options = NULL)
Argument | Description |
---|---|
urls | A vector of URLs of captured pages, allowing |
crawls | A vector of IDs of CC crawls to search. Values in |
.options | An optional object of class ccwarcs_options |
A tibble
# not run:
# url <- "http://www.celebuzz.com/2017-01-04"
# crawl <- "2018-47"
# results <- get_cc_index(url, crawl)