:scroll: Simple Html Dom Parser for PHP

DomParser API

find findMulti findMultiOrFalse findOne
findOneOrFalse fixHtmlOutput getDocument getElementByClass
getElementById getElementByTagName getElementsById getElementsByTagName
html innerHtml innerXml loadHtml
loadHtmlFile save set_callback text
xml

SimpleHtmlDomNode (group of dom elements) API

count find findMulti findMultiOrFalse
findOne findOneOrFalse innerHtml innertext
outertext text

SimpleHtmlDom (single dom element) API

childNodes find findMulti findMultiOrFalse
findOne findOneOrFalse firstChild getAllAttributes
getAttribute getElementByClass getElementById getElementByTagName
getElementsById getElementsByTagName getHtmlDomParser getIterator
getNode hasAttribute html innerHtml
innerXml isRemoved lastChild nextNonWhitespaceSibling
nextSibling parentNode previousSibling removeAttribute
setAttribute text val


find(string $selector, int|null $idx):

Find a list of nodes with a CSS selector.

Parameters:

  • string $selector
  • int|null $idx

Return:

  • TODO: __not_detected__

findMulti(string $selector):

Find nodes with a CSS selector.

Parameters:

  • string $selector

Return:

  • TODO: __not_detected__

findMultiOrFalse(string $selector):

Find nodes with a CSS selector or false, if no element is found.

Parameters:

  • string $selector

Return:

  • TODO: __not_detected__

findOne(string $selector):

Find one node with a CSS selector.

Parameters:

  • string $selector

Return:

  • TODO: __not_detected__

findOneOrFalse(string $selector):

Find one node with a CSS selector or false, if no element is found.

Parameters:

  • string $selector

Return:

  • TODO: __not_detected__

fixHtmlOutput(string $content, bool $multiDecodeNewHtmlEntity): string

Parameters:

  • string $content
  • bool $multiDecodeNewHtmlEntity

Return:

  • string

getDocument(): DOMDocument

Parameters: nothing

Return:

  • \DOMDocument

getElementByClass(string $class):

Return elements by ".class".

Parameters:

  • string $class

Return:

  • TODO: __not_detected__

getElementById(string $id):

Return element by "#id".

Parameters:

  • string $id

Return:

  • TODO: __not_detected__

getElementByTagName(string $name):

Return element by tag name.

Parameters:

  • string $name

Return:

  • TODO: __not_detected__

getElementsById(string $id, int|null $idx):

Returns elements by "#id".

Parameters:

  • string $id
  • int|null $idx

Return:

  • TODO: __not_detected__

getElementsByTagName(string $name, int|null $idx):

Returns elements by tag name.

Parameters:

  • string $name
  • int|null $idx

Return:

  • TODO: __not_detected__

html(bool $multiDecodeNewHtmlEntity): string

Get dom node's outer html.

Parameters:

  • bool $multiDecodeNewHtmlEntity

Return:

  • string

innerHtml(bool $multiDecodeNewHtmlEntity): string

Get dom node's inner html.

Parameters:

  • bool $multiDecodeNewHtmlEntity

Return:

  • string

innerXml(bool $multiDecodeNewHtmlEntity): string

Get dom node's inner xml.

Parameters:

  • bool $multiDecodeNewHtmlEntity

Return:

  • string

loadHtml(string $html, int|null $libXMLExtraOptions): DomParserInterface

Load HTML from string.

Parameters:

  • string $html
  • int|null $libXMLExtraOptions

Return:

  • \DomParserInterface
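
As a rough illustration of the loading step, the sketch below parses a small HTML string and reads one node back. It assumes the library is installed through Composer and that the parser class can be imported as `voku\helper\HtmlDomParser` (this reference names it only as `\HtmlDomParser`); the autoload path and the sample markup are made up for the example.

```php
<?php

use voku\helper\HtmlDomParser; // assumed namespace, adjust to your install

require __DIR__ . '/vendor/autoload.php'; // assumed Composer autoload path

// loadHtml() returns a DomParserInterface, so the call can be chained.
$dom = (new HtmlDomParser())->loadHtml('<div id="main"><p class="intro">Hello</p></div>');

// Look up a single node with a CSS selector and read its plain text.
$intro = $dom->findOne('p.intro');
$introText = $intro->text();

echo $introText, PHP_EOL;
```

`loadHtmlFile()` takes a file path instead of a string and should be usable in the same way.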

loadHtmlFile(string $filePath, int|null $libXMLExtraOptions): DomParserInterface

Load HTML from file.

Parameters:

  • string $filePath
  • int|null $libXMLExtraOptions

Return:

  • \DomParserInterface

save(string $filepath): string

Save the html-dom as string.

Parameters:

  • string $filepath

Return:

  • string

set_callback(callable $functionName):

Parameters:

  • callable $functionName

Return:

  • TODO: __not_detected__

text(bool $multiDecodeNewHtmlEntity): string

Get dom node's plain text.

Parameters:

  • bool $multiDecodeNewHtmlEntity

Return:

  • string

xml(bool $multiDecodeNewHtmlEntity, bool $htmlToXml, bool $removeXmlHeader, int $options): string

Get the HTML as XML or plain XML if needed.

Parameters:

  • bool $multiDecodeNewHtmlEntity
  • bool $htmlToXml
  • bool $removeXmlHeader
  • int $options

Return:

  • string

count(): int

Get the number of items in this dom node.

Parameters: nothing

Return:

  • int
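
A minimal sketch of counting a group of matched nodes. The import path `voku\helper\HtmlDomParser`, the Composer autoload path, and the sample list are assumptions for illustration only.

```php
<?php

use voku\helper\HtmlDomParser; // assumed namespace

require __DIR__ . '/vendor/autoload.php'; // assumed Composer autoload path

$dom = (new HtmlDomParser())->loadHtml('<ul><li>a</li><li>b</li><li>c</li></ul>');

// findMulti() returns a group of elements; count() reports how many matched.
$items = $dom->findMulti('li');
$total = $items->count();

echo 'Matched items: ', $total, PHP_EOL;
```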

find(string $selector, int $idx): SimpleHtmlDomNode|\SimpleHtmlDomNode[]|null

Find a list of nodes with a CSS selector.

Parameters:

  • string $selector
  • int $idx

Return:

  • \SimpleHtmlDomNode|\SimpleHtmlDomNode[]|null

findMulti(string $selector): SimpleHtmlDomInterface[]|\SimpleHtmlDomNodeInterface<\SimpleHtmlDomInterface>

Find nodes with a CSS selector.

Parameters:

  • string $selector

Return:

  • \SimpleHtmlDomInterface[]|\SimpleHtmlDomNodeInterface<\SimpleHtmlDomInterface>

findMultiOrFalse(string $selector): false|\SimpleHtmlDomInterface[]|\SimpleHtmlDomNodeInterface<\SimpleHtmlDomInterface>

Find nodes with a CSS selector or false, if no element is found.

Parameters:

  • string $selector

Return:

  • false|\SimpleHtmlDomInterface[]|\SimpleHtmlDomNodeInterface<\SimpleHtmlDomInterface>

findOne(string $selector): SimpleHtmlDomNode|null

Find one node with a CSS selector.

Parameters:

  • string $selector

Return:

  • \SimpleHtmlDomNode|null

findOneOrFalse(string $selector): false|\SimpleHtmlDomNode

Find one node with a CSS selector or false, if no element is found.

Parameters:

  • string $selector

Return:

  • false|\SimpleHtmlDomNode

innerHtml(): string[]

Get html of elements.

Parameters: nothing

Return:

  • string[]

innertext(): string[]

Alias for "$this->innerHtml()" (added for compatibility reasons with v1.x).

Parameters: nothing

Return:

  • string[]

outertext(): string[]

Alias for "$this->innerHtml()" (added for compatibility reasons with v1.x).

Parameters: nothing

Return:

  • string[]

text(): string[]

Get plain text.

Parameters: nothing

Return:

  • string[]
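
The group-level `text()` returns one string per matched element rather than a single string. The sketch below shows that shape; the namespace `voku\helper\HtmlDomParser`, the autoload path, and the markup are illustrative assumptions.

```php
<?php

use voku\helper\HtmlDomParser; // assumed namespace

require __DIR__ . '/vendor/autoload.php'; // assumed Composer autoload path

$dom = (new HtmlDomParser())->loadHtml('<ul><li>Alpha</li><li>Beta</li></ul>');

// text() on a group yields an array with one plain-text entry per node.
$texts = $dom->findMulti('li')->text();

foreach ($texts as $index => $text) {
    echo $index, ': ', $text, PHP_EOL;
}
```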

childNodes(int $idx): SimpleHtmlDomInterface|\SimpleHtmlDomInterface[]|\SimpleHtmlDomNodeInterface|null

Returns children of node.

Parameters:

  • int $idx

Return:

  • \SimpleHtmlDomInterface|\SimpleHtmlDomInterface[]|\SimpleHtmlDomNodeInterface|null

find(string $selector, int|null $idx): SimpleHtmlDomInterface|\SimpleHtmlDomInterface[]|\SimpleHtmlDomNodeInterface<\SimpleHtmlDomInterface>

Find a list of nodes with a CSS selector.

Parameters:

  • string $selector
  • int|null $idx

Return:

  • \SimpleHtmlDomInterface|\SimpleHtmlDomInterface[]|\SimpleHtmlDomNodeInterface<\SimpleHtmlDomInterface>

findMulti(string $selector): SimpleHtmlDomInterface[]|\SimpleHtmlDomNodeInterface<\SimpleHtmlDomInterface>

Find nodes with a CSS selector.

Parameters:

  • string $selector

Return:

  • \SimpleHtmlDomInterface[]|\SimpleHtmlDomNodeInterface<\SimpleHtmlDomInterface>

findMultiOrFalse(string $selector): false|\SimpleHtmlDomInterface[]|\SimpleHtmlDomNodeInterface<\SimpleHtmlDomInterface>

Find nodes with a CSS selector or false, if no element is found.

Parameters:

  • string $selector

Return:

  • false|\SimpleHtmlDomInterface[]|\SimpleHtmlDomNodeInterface<\SimpleHtmlDomInterface>

findOne(string $selector): SimpleHtmlDomInterface

Find one node with a CSS selector.

Parameters:

  • string $selector

Return:

  • \SimpleHtmlDomInterface

findOneOrFalse(string $selector): false|\SimpleHtmlDomInterface

Find one node with a CSS selector or false, if no element is found.

Parameters:

  • string $selector

Return:

  • false|\SimpleHtmlDomInterface
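
The `...OrFalse` variants make a missing match explicit. A small sketch of checking for `false` before using the result follows; the namespace `voku\helper\HtmlDomParser`, the autoload path, and the markup are assumptions made for the example.

```php
<?php

use voku\helper\HtmlDomParser; // assumed namespace

require __DIR__ . '/vendor/autoload.php'; // assumed Composer autoload path

$dom = (new HtmlDomParser())->loadHtml('<ul id="menu"><li class="item">Home</li></ul>');
$menu = $dom->findOne('#menu');

// One selector that matches and one that does not.
$item = $menu->findOneOrFalse('li.item');
$link = $menu->findOneOrFalse('a');

// Compare strictly against false before calling element methods.
$itemText = $item !== false ? $item->text() : null;
$linkFound = $link !== false;

echo $itemText, ' / link found: ', var_export($linkFound, true), PHP_EOL;
```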

firstChild(): SimpleHtmlDomInterface|null

Returns the first child of node.

Parameters: nothing

Return:

  • \SimpleHtmlDomInterface|null

getAllAttributes(): string[]|null

Returns an array of attributes.

Parameters: nothing

Return:

  • string[]|null

getAttribute(string $name): string

Return attribute value.

Parameters:

  • string $name

Return:

  • string

getElementByClass(string $class): SimpleHtmlDomInterface[]|\SimpleHtmlDomNodeInterface<\SimpleHtmlDomInterface>

Return elements by ".class".

Parameters:

  • string $class

Return:

  • \SimpleHtmlDomInterface[]|\SimpleHtmlDomNodeInterface<\SimpleHtmlDomInterface>

getElementById(string $id): SimpleHtmlDomInterface

Return element by "#id".

Parameters:

  • string $id

Return:

  • \SimpleHtmlDomInterface

getElementByTagName(string $name): SimpleHtmlDomInterface

Return element by tag name.

Parameters:

  • string $name

Return:

  • \SimpleHtmlDomInterface

getElementsById(string $id, int|null $idx): SimpleHtmlDomInterface|\SimpleHtmlDomInterface[]|\SimpleHtmlDomNodeInterface<\SimpleHtmlDomInterface>

Returns elements by "#id".

Parameters:

  • string $id
  • int|null $idx

Return:

  • \SimpleHtmlDomInterface|\SimpleHtmlDomInterface[]|\SimpleHtmlDomNodeInterface<\SimpleHtmlDomInterface>

getElementsByTagName(string $name, int|null $idx): SimpleHtmlDomInterface|\SimpleHtmlDomInterface[]|\SimpleHtmlDomNodeInterface<\SimpleHtmlDomInterface>

Returns elements by tag name.

Parameters:

  • string $name
  • int|null $idx

Return:

  • \SimpleHtmlDomInterface|\SimpleHtmlDomInterface[]|\SimpleHtmlDomNodeInterface<\SimpleHtmlDomInterface>

getHtmlDomParser(): HtmlDomParser

Create a new "HtmlDomParser"-object from the current context.

Parameters: nothing

Return:

  • \HtmlDomParser

getIterator(): SimpleHtmlDomNodeInterface<\SimpleHtmlDomInterface>

Retrieve an external iterator.

Parameters: nothing

Return:

  • \SimpleHtmlDomNodeInterface<\SimpleHtmlDomInterface> An instance of an object implementing Iterator or Traversable

getNode(): DOMNode

Parameters: nothing

Return:

  • \DOMNode

hasAttribute(string $name): bool

Determine if an attribute exists on the element.

Parameters:

  • string $name

Return:

  • bool

html(bool $multiDecodeNewHtmlEntity): string

Get dom node's outer html.

Parameters:

  • bool $multiDecodeNewHtmlEntity

Return:

  • string

innerHtml(bool $multiDecodeNewHtmlEntity): string

Get dom node's inner html.

Parameters:

  • bool $multiDecodeNewHtmlEntity

Return:

  • string
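
To make the difference between `html()`, `innerHtml()`, and `text()` concrete, here is a hedged sketch on a single element. The namespace `voku\helper\HtmlDomParser`, the autoload path, and the markup are assumptions; exact whitespace in the serialized output may differ.

```php
<?php

use voku\helper\HtmlDomParser; // assumed namespace

require __DIR__ . '/vendor/autoload.php'; // assumed Composer autoload path

$dom = (new HtmlDomParser())->loadHtml('<div id="box"><p>Hi <b>there</b></p></div>');
$box = $dom->findOne('#box');

$outer = $box->html();      // markup including the <div> itself
$inner = $box->innerHtml(); // markup of the children only
$plain = $box->text();      // text content without tags

echo $outer, PHP_EOL, $inner, PHP_EOL, $plain, PHP_EOL;
```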

innerXml(bool $multiDecodeNewHtmlEntity): string

Get dom node's inner xml.

Parameters:

  • bool $multiDecodeNewHtmlEntity

Return:

  • string

isRemoved(): bool

Check whether the node has been removed. A node can be partially destroyed: it is still an actual DOM node (such as \DOMElement), but almost its entire body is gone, including the nodeType attribute.

Parameters: nothing

Return:

  • bool true if node has been destroyed

lastChild(): SimpleHtmlDomInterface|null

Returns the last child of node.

Parameters: nothing

Return:

  • \SimpleHtmlDomInterface|null

nextNonWhitespaceSibling(): SimpleHtmlDomInterface|null

Returns the next sibling of node, ignoring whitespace elements.

Parameters: nothing

Return:

  • \SimpleHtmlDomInterface|null

nextSibling(): SimpleHtmlDomInterface|null

Returns the next sibling of node.

Parameters: nothing

Return:

  • \SimpleHtmlDomInterface|null

parentNode(): SimpleHtmlDomInterface

Returns the parent of node.

Parameters: nothing

Return:

  • \SimpleHtmlDomInterface

previousSibling(): SimpleHtmlDomInterface|null

Returns the previous sibling of node.

Parameters: nothing

Return:

  • \SimpleHtmlDomInterface|null
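
The traversal methods (`firstChild()`, `lastChild()`, `nextSibling()`, `previousSibling()`, `parentNode()`) can be combined as in the sketch below. The namespace `voku\helper\HtmlDomParser`, the autoload path, and the markup are assumptions. The markup has no whitespace between tags, so every sibling is an element rather than a whitespace text node.

```php
<?php

use voku\helper\HtmlDomParser; // assumed namespace

require __DIR__ . '/vendor/autoload.php'; // assumed Composer autoload path

$dom = (new HtmlDomParser())->loadHtml('<ul id="list"><li>one</li><li>two</li><li>three</li></ul>');
$list = $dom->findOne('#list');

$first = $list->firstChild();
$last = $list->lastChild();

// Most of these methods may return null, so guard each step.
$second = $first !== null ? $first->nextSibling() : null;
$backToFirst = $second !== null ? $second->previousSibling() : null;
$parentId = $second !== null ? $second->parentNode()->getAttribute('id') : null;

echo $first !== null ? $first->text() : '(none)', PHP_EOL;
echo $second !== null ? $second->text() : '(none)', PHP_EOL;
echo $last !== null ? $last->text() : '(none)', PHP_EOL;
```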

removeAttribute(string $name): SimpleHtmlDomInterface

Remove attribute.

Parameters:

  • string $name The name of the html-attribute.

Return:

  • \SimpleHtmlDomInterface

setAttribute(string $name, string|null $value, bool $strictEmptyValueCheck): SimpleHtmlDomInterface

Set attribute value.

Parameters:

  • string $name The name of the html-attribute.
  • string|null $value Set to NULL or empty string, to remove the attribute.
  • bool $strictEmptyValueCheck $value must be NULL, to remove the attribute, so that you can set an empty string as attribute-value e.g. autofocus=""

Return:

  • \SimpleHtmlDomInterface
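
The attribute methods can be sketched together, including the effect of `$strictEmptyValueCheck` described above. The namespace `voku\helper\HtmlDomParser`, the autoload path, the markup, and the `data-flag` attribute name are assumptions for illustration.

```php
<?php

use voku\helper\HtmlDomParser; // assumed namespace

require __DIR__ . '/vendor/autoload.php'; // assumed Composer autoload path

$dom = (new HtmlDomParser())->loadHtml('<a id="link" href="/old" title="Old title">Docs</a>');
$link = $dom->findOne('#link');

// Overwrite an existing attribute and read it back.
$link->setAttribute('href', '/new');
$href = $link->getAttribute('href');

// Remove an attribute and confirm it is gone.
$link->removeAttribute('title');
$hasTitle = $link->hasAttribute('title');

// With strict checking, an empty string is stored instead of removing the attribute.
$link->setAttribute('data-flag', '', true);
$hasFlag = $link->hasAttribute('data-flag');

// Without strict checking, an empty string removes the attribute.
$link->setAttribute('data-flag', '');
$hasFlagAfter = $link->hasAttribute('data-flag');

echo $link->html(), PHP_EOL;
```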

text(): string

Get dom node's plain text.

Parameters: nothing

Return:

  • string

val(string|string[]|null $value): string|string[]|null

Parameters:

  • string|string[]|null $value null === get the current input value; text === set a new input value

Return:

  • string|string[]|null
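
A hedged sketch of the get/set behaviour of `val()` on a text input: calling it without an argument reads the value, and calling it with a string sets it. The namespace `voku\helper\HtmlDomParser`, the autoload path, and the form markup are assumptions. Behaviour for other field types (select, checkbox, textarea) is not covered here.

```php
<?php

use voku\helper\HtmlDomParser; // assumed namespace

require __DIR__ . '/vendor/autoload.php'; // assumed Composer autoload path

$dom = (new HtmlDomParser())->loadHtml('<form><input type="text" name="q" value="old"></form>');
$input = $dom->findOne('input[name=q]');

// No argument (null): read the current value.
$before = $input->val();

// String argument: set a new value, then read it back.
$input->val('new');
$after = $input->val();

echo $before, ' -> ', $after, PHP_EOL;
```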