I know that there are dozens of ways to select the first child element in Nokogiri, but which is the cheapest?
I can't see a way around using Node#children, which sounds awfully expensive. If there are 10,000 child nodes, I don't want to touch the other 9,999.
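Solution: Node#child is the fastest way to get the first child. Note that it returns the first child node of any type, which may be a text node; Node#first_element_child returns the first child that is an element.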
However, if the node you're looking for is NOT the first, perhaps the 99th, then there is no faster way to select that node than to call children and index into it.
You are correct in stating that it's expensive to build a NodeSet for all children if you only want the first one.
One limiting factor is that libxml2 (the XML library underlying Nokogiri) stores a node's children as a linked list. So you'll need to traverse the list (O(n)) to select the desired child node.
It would be feasible to write a method to simply return the nth child, without instantiating a NodeSet or even Ruby objects for all the other children. My advice would be to open a feature request, or send an email to the Nokogiri mailing list.