# oembed-extractor

Extract oEmbed content from a given URL.

[![NPM](https://badge.fury.io/js/@extractus%2Foembed-extractor.svg)](https://badge.fury.io/js/@extractus%2Foembed-extractor)
![CodeQL](https://github.com/extractus/oembed-extractor/workflows/CodeQL/badge.svg)
[![CI test](https://github.com/extractus/oembed-extractor/workflows/ci-test/badge.svg)](https://github.com/extractus/oembed-extractor/actions)
[![Coverage Status](https://coveralls.io/repos/github/extractus/oembed-extractor/badge.svg)](https://coveralls.io/github/extractus/oembed-extractor)

## Demo

- [Give it a try!](https://extractor-demos.pages.dev/oembed-extractor)
- [Example FaaS](https://extractus.deno.dev/extract?apikey=rn0wbHos2e73W6ghQf705bdF&type=oembed&url=https://www.instagram.com/tv/CVlR5GFqF68/)

## Install & Usage

### Node.js

```bash
npm i @extractus/oembed-extractor

# pnpm
pnpm i @extractus/oembed-extractor

# yarn
yarn add @extractus/oembed-extractor
```

```ts
// ES module
import { extract } from '@extractus/oembed-extractor'

// CommonJS alternative (use instead of the import above):
// const { extract } = require('@extractus/oembed-extractor')

// or specify the exact path to the CommonJS variant:
// const { extract } = require('@extractus/oembed-extractor/dist/cjs/oembed-extractor.js')

const result = await extract('https://www.youtube.com/watch?v=x2bqscVkGxk')
console.log(result)
```

### Deno

```ts
// Deno < 1.28 (use instead of the npm: import below):
// import { extract } from 'https://esm.sh/@extractus/oembed-extractor'

// Deno >= 1.28
import { extract } from 'npm:@extractus/oembed-extractor'

const result = await extract('https://www.youtube.com/watch?v=x2bqscVkGxk')
console.log(result)
```

### Browser

```ts
import { extract } from "https://esm.sh/@extractus/oembed-extractor@latest"
```

Please check [the examples](examples) for reference.

## APIs

### `.extract()`

Load and extract oembed data.

#### Syntax

```ts
extract(String url)
extract(String url, Object params)
extract(String url, Object params, Object fetchOptions)
```

#### Parameters

##### `url` *required*

URL of a valid oEmbed resource, e.g. `https://www.youtube.com/watch?v=x2bqscVkGxk`

##### `params` *optional*

The optional `params` argument is useful when you want to specify additional customizations.

Here are several popular params:

- `maxwidth`: max width of embed size
- `maxheight`: max height of embed size
- `theme`: e.g., `dark` or `light`
- `lang`: e.g., `en`, `fr`, `cn`, `vi`, etc.

Note that some params are supported by certain providers but not by others.
Please see the provider's oEmbed API docs carefully for exact information.
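
As a rough sketch of how `params` might be assembled before the call, the snippet below uses a hypothetical helper, `cleanParams` (not part of this library), to drop unset values so that only explicit options are passed. The values are illustrative, and whether a provider honors them depends on its own oEmbed API.

```js
// Hypothetical helper (not part of oembed-extractor):
// remove undefined/null entries so only explicit params are sent.
const cleanParams = (params = {}) => {
  return Object.fromEntries(
    Object.entries(params).filter(([, value]) => value !== undefined && value !== null)
  )
}

const params = cleanParams({
  maxwidth: 640,
  maxheight: 480,
  theme: undefined // left out of the result
})

console.log(params) // { maxwidth: 640, maxheight: 480 }

// Then pass it as the second argument, for example:
// const oembed = await extract('https://www.youtube.com/watch?v=x2bqscVkGxk', params)
```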

##### `fetchOptions` *optional*

`fetchOptions` is an object that can have the following properties:

- `headers`: to set request headers
- `proxy`: another endpoint to forward the request to
- `agent`: an HTTP proxy agent
- `signal`: AbortController signal or AbortSignal timeout to terminate the request

You can use this param to set the request headers used when fetching.

For example:

```js
import { extract } from '@extractus/oembed-extractor'

const url = 'https://codepen.io/ndaidong/pen/LYmLKBw'
extract(url, null, {
  headers: {
    'user-agent': 'Opera/9.60 (Windows NT 6.0; U; en) Presto/2.1.1'
  }
})
```

You can also specify a proxy endpoint to load remote content, instead of fetching directly.

For example:

```js
import { extract } from '@extractus/oembed-extractor'

const url = 'https://codepen.io/ndaidong/pen/LYmLKBw'
extract(url, null, {
  headers: {
    'user-agent': 'Opera/9.60 (Windows NT 6.0; U; en) Presto/2.1.1'
  },
  proxy: {
    target: 'https://your-secret-proxy.io/loadJson?url=',
    headers: {
      'Proxy-Authorization': 'Bearer YWxhZGRpbjpvcGVuc2VzYW1l...'
    }
  }
})
```

With the above setting, the request will be forwarded to `https://your-secret-proxy.io/loadJson?url={OEMBED_ENDPOINT}`.
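
To make the forwarding concrete, here is a minimal sketch of how such a URL could be composed. It assumes that the oEmbed endpoint is URL-encoded and appended to `target`; `buildProxyUrl` and the `provider.example` endpoint are made up for illustration and are not part of the library.

```js
// Assumption: the endpoint is URL-encoded and appended to `target`.
const buildProxyUrl = (target, endpoint) => `${target}${encodeURIComponent(endpoint)}`

// Hypothetical oEmbed endpoint, used only for illustration
const endpoint = 'https://provider.example/oembed?format=json'

const forwarded = buildProxyUrl('https://your-secret-proxy.io/loadJson?url=', endpoint)
console.log(forwarded)
// https://your-secret-proxy.io/loadJson?url=https%3A%2F%2Fprovider.example%2Foembed%3Fformat%3Djson
```

Under that assumption, the proxy side can recover the original endpoint from the `url` query parameter, for instance with `new URL(forwarded).searchParams.get('url')`.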

Another way to work with a proxy is to use the `agent` option instead of `proxy`, as below:

```js
import { extract } from '@extractus/oembed-extractor'

import { HttpsProxyAgent } from 'https-proxy-agent'

const proxy = 'http://abc:RaNdoMpasswORd_country-France@proxy.packetstream.io:31113'

const url = 'https://codepen.io/ndaidong/pen/LYmLKBw'

const oembed = await extract(url, null, {
  agent: new HttpsProxyAgent(proxy),
})
console.log('Run oembed-extractor with proxy:', proxy)
console.log(oembed)
```

For more info about [https-proxy-agent](https://www.npmjs.com/package/https-proxy-agent), check [its repo](https://github.com/TooTallNate/proxy-agents).

By default, there is no request timeout. You can use the `signal` option to cancel the request when needed.

The common way is to use AbortController:

```js
const controller = new AbortController()

// stop after 5 seconds
setTimeout(() => {
  controller.abort()
}, 5000)

const oembed = await extract(url, null, {
  signal: controller.signal,
})
```

A newer solution is AbortSignal's `timeout()` static method:

```js
// stop after 5 seconds
const oembed = await extract(url, null, {
  signal: AbortSignal.timeout(5000),
})
```

For more info:

- [AbortController constructor](https://developer.mozilla.org/en-US/docs/Web/API/AbortController)
- [AbortSignal: timeout() static method](https://developer.mozilla.org/en-US/docs/Web/API/AbortSignal/timeout_static)
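
The two approaches above can be combined into one small helper. The sketch below is only an illustration: `timeoutSignal` is a hypothetical name, not part of the library. It prefers `AbortSignal.timeout()` where the runtime provides it and falls back to an `AbortController` with a timer otherwise.

```js
// Hypothetical helper (not part of oembed-extractor):
// return a signal that aborts after `ms` milliseconds.
const timeoutSignal = (ms) => {
  if (typeof AbortSignal.timeout === 'function') {
    return AbortSignal.timeout(ms)
  }
  const controller = new AbortController()
  const timer = setTimeout(() => controller.abort(), ms)
  // In Node.js, do not keep the process alive just for this timer
  if (typeof timer.unref === 'function') {
    timer.unref()
  }
  return controller.signal
}

const signal = timeoutSignal(5000)
console.log(signal.aborted) // false

// Then pass it through fetchOptions, for example:
// const oembed = await extract(url, null, { signal })
```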

### `.setProviderList()`

Apply a list of providers to use, overriding the [default](src/utils/providers.orginal.json).

#### Syntax

```ts
setProviderList(Array providers)
```

#### Parameters

##### `providers` *required*

List of providers to apply.

For example:

```js
import { setProviderList } from '@extractus/oembed-extractor'

const providers = [
  {
    provider_name: 'Alpha',
    provider_url: 'https://alpha.com',
    endpoints: [
      // endpoint definition here
    ]
  },
  {
    provider_name: 'Beta',
    provider_url: 'https://beta.com',
    endpoints: [
      // endpoint definition here
    ]
  }
]

setProviderList(providers)
```

The default list of resource providers is synchronized from [oembed.com](http://oembed.com/providers.json).

If you want to modify the provider list, please open a pull request on [iamcal/oembed](https://github.com/iamcal/oembed), then create an issue or PR here to ask for a sync.
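
The `// endpoint definition here` placeholders above are not spelled out in this README. Since the default list is synchronized from oembed.com, a reasonable assumption is that each endpoint follows the shape used in `providers.json` (`schemes`, `url`, and optionally `discovery`). The sketch below uses made-up values for the fictional "Alpha" provider, and `matchesProvider` is a hypothetical helper, not part of the library, included only to show how wildcard schemes relate to page URLs.

```js
// Assumed shape, mirroring entries in oembed.com's providers.json.
// All values are fictional.
const alpha = {
  provider_name: 'Alpha',
  provider_url: 'https://alpha.com',
  endpoints: [
    {
      schemes: ['https://alpha.com/videos/*'],
      url: 'https://alpha.com/api/oembed',
      discovery: true
    }
  ]
}

// Hypothetical helper: convert a scheme with `*` wildcards into a RegExp.
const schemeToRegExp = (scheme) => {
  const escaped = scheme
    .replace(/[.+?^${}()|[\]\\]/g, '\\$&') // escape regex metacharacters except `*`
    .replace(/\*/g, '.*') // `*` matches any run of characters
  return new RegExp(`^${escaped}$`)
}

// Hypothetical helper: does any scheme of this provider match the URL?
const matchesProvider = (provider, url) => {
  return provider.endpoints.some((endpoint) => {
    return (endpoint.schemes || []).some((scheme) => schemeToRegExp(scheme).test(url))
  })
}

console.log(matchesProvider(alpha, 'https://alpha.com/videos/123')) // true
console.log(matchesProvider(alpha, 'https://beta.com/videos/123')) // false
```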

## Facebook and Instagram

In order to work with links from Facebook and Instagram, you need a [reviewed Facebook app](https://developers.facebook.com/docs/app-review) with the [oEmbed Read](https://developers.facebook.com/docs/features-reference/oembed-read) permission.

When it sees a link from Facebook or Instagram, `oembed-extractor` looks for the environment variables `FACEBOOK_APP_ID` and `FACEBOOK_CLIENT_TOKEN` to retrieve oEmbed data using your app credentials.

For example:

```bash
export FACEBOOK_APP_ID=your_app_id
export FACEBOOK_CLIENT_TOKEN=your_client_token

npm run eval https://www.instagram.com/tv/CVlR5GFqF68/
```
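
When calling the library from your own script rather than via `npm run eval`, it may help to check for these variables up front. The following is a minimal sketch for Node.js; `missingFacebookEnv` is a hypothetical helper, not something the library exports.

```js
// Hypothetical helper (not part of oembed-extractor):
// list the required variables that are unset or empty.
const missingFacebookEnv = (env = process.env) => {
  return ['FACEBOOK_APP_ID', 'FACEBOOK_CLIENT_TOKEN'].filter((name) => !env[name])
}

const missing = missingFacebookEnv()
if (missing.length > 0) {
  console.warn(`Missing environment variables: ${missing.join(', ')}`)
}
```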

## Test

```bash
git clone https://github.com/extractus/oembed-extractor.git
cd oembed-extractor
npm i
npm test
```

![oembed-extractor unit test](https://i.imgur.com/Nr5BgUx.png)

## Quick evaluation

```bash
git clone https://github.com/extractus/oembed-extractor.git
cd oembed-extractor
npm i
npm run eval {URL_TO_PARSE_OEMBED}
```

## License

The MIT License (MIT)

---