Ansar Allah fight extracts heavy cost from Pentagon (thehill.com)
☆ Yσɠƚԋσʂ ☆@lemmy.ml to World News@lemmy.ml · English · 7 months ago · 6 comments

davel [he/him]@lemmy.ml · English · 7 months ago
It seems thehill.com is rejecting based on User-Agent. Anyone know what User-Agent Lemmy claims to be?

    $ curl -sI https://thehill.com/policy/defense/4501958-houthi-fight-pentagon-cost/ | grep '^HTTP'
    HTTP/2 403
    $ curl -sIA 'Mozilla/5.0 (X11; Linux x86_64; rv:10.0) Gecko/20100101 Firefox/10.0' https://thehill.com/policy/defense/4501958-houthi-fight-pentagon-cost/ | grep '^HTTP'
    HTTP/2 200

☆ Yσɠƚԋσʂ ☆@lemmy.ml (OP) · 7 months ago
Looks like Lemmy/<version>
https://github.com/LemmyNet/lemmy/blob/d3737d44532b43b079a1be13400e3beb7453cac4/crates/api_common/src/request.rs#L35

davel [he/him]@lemmy.ml · English · 7 months ago

    $ curl -sIA 'Lemmy/0.19.3; blah blah' https://thehill.com/policy/defense/4501958-houthi-fight-pentagon-cost/ | grep '^HTTP'
    HTTP/2 200

Must be something else. I’m not familiar with “px-captcha”. It seems to come up in HUMAN’s docs: https://duckduckgo.com/?q=site%3Aedocs.humansecurity.com+“px-captcha”

☆ Yσɠƚԋσʂ ☆@lemmy.ml (OP) · 7 months ago
Oh interesting, I guess there’s some heuristic it does to detect non-browser clients.
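
For anyone who wants to repeat the curl checks from this thread in a script, here is a rough Python sketch. It sends a HEAD request per User-Agent and collects the status codes. The function names `head_status` and `compare_user_agents` are made up for this example. One caveat: when no User-Agent is given, urllib sends its own default (`Python-urllib/<version>`), not curl's, so the first result need not match the curl output above. Since the block may be a heuristic rather than a plain User-Agent check, results could also differ for other reasons.

```python
import urllib.error
import urllib.request


def head_status(url, user_agent=None, timeout=10.0):
    """Send a HEAD request and return the HTTP status code.

    If user_agent is None, urllib's default User-Agent is sent.
    """
    headers = {"User-Agent": user_agent} if user_agent else {}
    req = urllib.request.Request(url, headers=headers, method="HEAD")
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses are raised as exceptions; the code is what we want.
        return err.code


def compare_user_agents(url, agents, fetch=head_status):
    """Return {label: status} for each User-Agent in the agents mapping."""
    return {label: fetch(url, ua) for label, ua in agents.items()}


if __name__ == "__main__":
    url = "https://thehill.com/policy/defense/4501958-houthi-fight-pentagon-cost/"
    agents = {
        "urllib default": None,
        "firefox": "Mozilla/5.0 (X11; Linux x86_64; rv:10.0) Gecko/20100101 Firefox/10.0",
        "lemmy": "Lemmy/0.19.3; blah blah",
    }
    for label, status in compare_user_agents(url, agents).items():
        print(f"{label}: HTTP {status}")
```

The `fetch` parameter is only there so the comparison can be exercised with a stand-in function instead of hitting the real site.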