Agent skill

arxiv-search

Search arXiv for preprints in physics, math, CS, quantitative biology, quantitative finance, statistics, electrical engineering, and economics. Use when: (1) finding preprints by topic, (2) searching by author, (3) browsing arXiv categories, (4) getting paper metadata/abstracts. NOT for: published journal articles (use crossref-search) or biomedical literature (use pubmed-search).


Stars 571

Forks 57

Install this agent skill to your Project

npx add-skill https://github.com/beita6969/ScienceClaw/tree/main/skills/arxiv-search

Metadata

Additional technical details for this skill

openclaw: { "emoji": "\ud83d\udcc4", "requires": { "bins": [ "curl" ] } }

SKILL.md

arXiv Search

Search arXiv preprints via the public API. Covers physics, math, CS, q-bio, q-fin, statistics, electrical engineering, and economics.

API Endpoint

bash

curl -s "http://export.arxiv.org/api/query?search_query=all:transformer+attention&start=0&max_results=5"

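Parameters: search_query= (search expression; may be omitted when id_list= is given), id_list= (comma-separated arXiv IDs for direct lookup), start= (zero-based pagination offset), max_results= (default 10; at most 2000 per request and 30000 per query overall), sortBy=relevance|lastUpdatedDate|submittedDate, sortOrder=ascending|descending.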
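The parameters above can also be assembled programmatically. The following is a minimal Python sketch, not part of the skill itself: the function name `build_query_url` and its keyword names are hypothetical, and it assumes the API accepts standard percent-encoding (for example `%3A` for the colon), which is what `urllib.parse.urlencode` produces.

```python
from urllib.parse import urlencode

API_URL = "http://export.arxiv.org/api/query"


def build_query_url(search_query=None, id_list=None, start=0, max_results=10,
                    sort_by=None, sort_order=None):
    """Assemble an arXiv API URL (hypothetical helper, not from the skill)."""
    if not search_query and not id_list:
        raise ValueError("supply search_query, id_list, or both")
    params = {}
    if search_query:
        params["search_query"] = search_query
    if id_list:
        # id_list is sent as one comma-separated value.
        params["id_list"] = ",".join(id_list)
    params["start"] = start
    params["max_results"] = max_results
    if sort_by:
        params["sortBy"] = sort_by
    if sort_order:
        params["sortOrder"] = sort_order
    # urlencode turns spaces into + and percent-encodes reserved characters.
    return f"{API_URL}?{urlencode(params)}"
```

Calling `build_query_url("all:transformer attention", max_results=5)` yields a URL equivalent to the curl example, with the colon percent-encoded.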

Query Syntax

Field prefixes: ti: title, au: author, abs: abstract, co: comment, jr: journal ref, cat: category, all: all fields.

Boolean: AND, OR, ANDNOT. Example:

bash

curl -s "http://export.arxiv.org/api/query?search_query=au:bengio+AND+cat:cs.LG+AND+ti:attention&max_results=10"
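Composing such queries by hand gets error-prone once several fields are involved. Below is a small Python sketch of one way to build them; the helper names `field_term`, `all_of`, and `any_of` are hypothetical. It relies on two conventions from the arXiv API manual that this document does not otherwise cover: double quotes mark a multi-word phrase, and parentheses group sub-expressions.

```python
from urllib.parse import quote_plus


def field_term(prefix, value):
    """Return 'prefix:value'; multi-word values are quoted as a phrase."""
    value = value.strip()
    if " " in value:
        value = f'"{value}"'
    return f"{prefix}:{value}"


def all_of(*terms):
    """Join terms with AND."""
    return " AND ".join(terms)


def any_of(*terms):
    """Join terms with OR, parenthesised so the group can be nested."""
    return "(" + " OR ".join(terms) + ")"


# Rebuild the example query above, then encode it for use in a URL.
query = all_of(
    field_term("au", "bengio"),
    field_term("cat", "cs.LG"),
    field_term("ti", "attention"),
)
encoded = quote_plus(query)
```

Here `query` is the unencoded expression and `encoded` is the form suitable for the `search_query=` parameter.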

Category Codes

Physics: astro-ph (.CO/.EP/.GA/.HE/.IM/.SR), cond-mat (.dis-nn/.mes-hall/.mtrl-sci/.soft/.stat-mech/.str-el/.supr-con), hep-ex, hep-lat, hep-ph, hep-th, quant-ph, gr-qc, nucl-ex, nucl-th

CS: cs.AI, cs.CL (NLP), cs.CV, cs.LG (ML), cs.CR, cs.DB, cs.DS, cs.SE, cs.RO

Math: math.AG, math.AP, math.CO, math.PR, math.ST

Other: q-bio (.BM/.CB/.GN/.MN/.NC/.PE/.QM/.SC/.TO), q-fin (.CP/.EC/.GN/.MF/.PM/.PR/.RM/.ST/.TR), stat (.AP/.CO/.ME/.ML/.OT/.TH), eess (.AS/.IV/.SP/.SY), econ (.EM/.GN/.TH)

Response Parsing

The API returns Atom XML. Parse with Python:

bash

curl -s "http://export.arxiv.org/api/query?search_query=ti:large+language+model&max_results=5&sortBy=submittedDate&sortOrder=descending" | python3 -c "
import sys, xml.etree.ElementTree as ET
ns = {'a': 'http://www.w3.org/2005/Atom'}
root = ET.parse(sys.stdin).getroot()
for entry in root.findall('a:entry', ns):
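    # Titles can wrap across lines in the feed, so collapse every run of whitespace.
    title = ' '.join(entry.find('a:title', ns).text.split())
    # The entry id is the abstract URL; keep only the arXiv ID after /abs/.
    aid = entry.find('a:id', ns).text.strip().split('/abs/')[-1]
    pub = entry.find('a:published', ns).text[:10]  # keep the YYYY-MM-DD part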
    authors = ', '.join(a.find('a:name', ns).text for a in entry.findall('a:author', ns))
    print(f'[{aid}] {pub} | {title}')
    print(f'  Authors: {authors}\n')
"

Direct Lookup and Pagination

bash

# By ID
curl -s "http://export.arxiv.org/api/query?id_list=2301.07041,2302.13971"

# Pagination
curl -s "http://export.arxiv.org/api/query?search_query=cat:cs.AI&start=0&max_results=25&sortBy=submittedDate&sortOrder=descending"
curl -s "http://export.arxiv.org/api/query?search_query=cat:cs.AI&start=25&max_results=25&sortBy=submittedDate&sortOrder=descending"
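The two pagination calls above differ only in `start`. As a sketch of how the offsets could be generated for an arbitrary result count, the hypothetical generator below yields `(start, max_results)` pairs; `total_results` is assumed to come from the `opensearch:totalResults` element of an earlier response, and `limit` is an optional cap chosen by the caller.

```python
def page_offsets(total_results, page_size=25, limit=None):
    """Yield (start, max_results) pairs covering the result set page by page."""
    end = total_results if limit is None else min(total_results, limit)
    for start in range(0, end, page_size):
        # The last page may be shorter than page_size.
        yield start, min(page_size, end - start)


# 60 matches at 25 per page need three requests.
pages = list(page_offsets(60, page_size=25))
```

Each pair maps directly onto the `start=` and `max_results=` parameters of one request.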

Rate Limiting

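arXiv's API terms of use ask clients to make no more than one request every 3 seconds over a single connection, so leave a 3-second delay between calls when running bulk queries. For large-scale harvesting, use the OAI-PMH bulk access endpoint instead.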
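One way to honor the 3-second spacing in a script is a small throttle object called before every request. This is an illustrative sketch only: the class name `Throttle` and its parameters are hypothetical, and the clock and sleep functions are injectable so the behavior can be checked without actually waiting.

```python
import time


class Throttle:
    """Block until at least min_interval seconds have passed since the last call."""

    def __init__(self, min_interval=3.0, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous call, None before the first

    def wait(self):
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now
```

A bulk loop would create one `Throttle()` and call `wait()` immediately before each HTTP request; the first call returns at once.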

Best Practices

Use sortBy=submittedDate&sortOrder=descending for latest papers.
Combine cat: with keyword searches for targeted results.
Check opensearch:totalResults in the response for total match count.
For PDF access, replace /abs/ with /pdf/ in the paper URL.
Use id_list for direct lookups (faster and more reliable).
URL-encode spaces as + in query terms.
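Two of the practices above, reading `opensearch:totalResults` and deriving the PDF link from the `/abs/` URL, can be sketched in Python as follows. The function name `summarize_feed` is hypothetical, the OpenSearch namespace URI is the one arXiv's Atom feed is assumed to declare, and the sample feed is a made-up placeholder rather than a real paper.

```python
import xml.etree.ElementTree as ET

NS = {
    "a": "http://www.w3.org/2005/Atom",
    "os": "http://a9.com/-/spec/opensearch/1.1/",  # assumed OpenSearch namespace
}


def summarize_feed(atom_xml):
    """Return (total_results, papers) extracted from an Atom response string."""
    root = ET.fromstring(atom_xml)
    total_el = root.find("os:totalResults", NS)
    total = int(total_el.text) if total_el is not None else None
    papers = []
    for entry in root.findall("a:entry", NS):
        abs_url = entry.find("a:id", NS).text.strip()
        papers.append({
            "id": abs_url.split("/abs/")[-1],
            # Collapse line wraps and repeated spaces in the title.
            "title": " ".join(entry.find("a:title", NS).text.split()),
            "pdf_url": abs_url.replace("/abs/", "/pdf/"),
        })
    return total, papers


# Placeholder feed for illustration only; the ID and title are not a real paper.
SAMPLE = """<feed xmlns="http://www.w3.org/2005/Atom"
      xmlns:opensearch="http://a9.com/-/spec/opensearch/1.1/">
  <opensearch:totalResults>1</opensearch:totalResults>
  <entry>
    <id>http://arxiv.org/abs/0000.00000v1</id>
    <title>Placeholder
      title</title>
  </entry>
</feed>"""

total, papers = summarize_feed(SAMPLE)
```

In real use the argument would be the body returned by the API, and every field reported to the user would come from that response.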

Zero-Hallucination Rule

NEVER fabricate results from training data. Every paper title, author, DOI, PMID, citation count, and metadata detail presented to the user MUST come from an actual API response in this conversation. If the API returns no results or partial data, report exactly what was returned. Do not "fill in" missing details from memory.

Maintainer

beita6969 Core maintainer

Source details

Full Name: beita6969/ScienceClaw
Branch: main
Path in repo: skills/arxiv-search
License: MIT License
Topics: ai mcp llm openclaw ai-agent research science bioinformatics scientific-research meta-analysis research-tools pubmed literature-review self-evolving zero-hallucination
