This repository has been archived on 2026-02-24. You can view files and clone it. You cannot open issues or pull requests or push a commit.
market/scrape/product/extractors/title.py
giles 478636f799 feat: decouple market from shared_lib, add app-owned models
Phase 1-3 of decoupling:
- path_setup.py adds project root to sys.path
- Market-owned models in market/models/ (market, market_place)
- All imports updated: shared.infrastructure, shared.db, shared.browser, etc.
- MarketPlace uses container_type/container_id instead of post_id FK

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-11 12:46:32 +00:00

from __future__ import annotations

from typing import Dict

from bs4 import BeautifulSoup

from shared.utils import normalize_text
from ..registry import extractor


@extractor
def ex_title(soup: BeautifulSoup, url: str) -> Dict:
    """Return the product title, falling back to "Product" if none is found."""
    title = None
    # Try selectors in priority order; the first non-empty match wins.
    for sel in ["h1.page-title span", "h1.page-title", "h1.product-name", "meta[property='og:title']"]:
        el = soup.select_one(sel)
        if el:
            # <meta> tags carry the title in their content attribute, not in text.
            title = normalize_text(el.get_text()) if el.name != "meta" else el.get("content")
            if title:
                break
    return {"title": title or "Product"}
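
The `@extractor` decorator is imported from `..registry`, which is not shown on this page. As a rough sketch of how a decorator-based registry of this kind could work, the example below collects functions in a list and merges their results. Everything in it is an assumption for illustration: the `EXTRACTORS` list, `run_extractors`, `ex_url`, and the simplified `ex_title` that takes raw HTML instead of a `BeautifulSoup` tree are all hypothetical and may not match the real module.

```python
from typing import Callable, Dict, List

# Hypothetical registry: the real ..registry module is not shown here.
Extractor = Callable[[str, str], Dict]
EXTRACTORS: List[Extractor] = []


def extractor(fn: Extractor) -> Extractor:
    """Register fn and return it unchanged so it stays directly callable."""
    EXTRACTORS.append(fn)
    return fn


@extractor
def ex_title(html: str, url: str) -> Dict:
    # Stand-in for the real extractor: naive <h1> lookup on a raw HTML string.
    start = html.find("<h1>")
    end = html.find("</h1>")
    title = html[start + 4:end].strip() if start != -1 and end > start else None
    # Same fallback as the original: default to "Product" when nothing matched.
    return {"title": title or "Product"}


@extractor
def ex_url(html: str, url: str) -> Dict:
    # Hypothetical second extractor, included only to show result merging.
    return {"url": url}


def run_extractors(html: str, url: str) -> Dict:
    """Call every registered extractor in order and merge the partial dicts."""
    result: Dict = {}
    for fn in EXTRACTORS:
        result.update(fn(html, url))
    return result
```

With this layout each extractor returns a small dict of its own keys, so adding a field means adding one decorated function rather than editing a central parser.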