A PHP function for cleaning up scraped HTML

Date: October 4, 2023 / Source: the web / Editor: anonymous

The function:
function xxfseo_body($body){
  // Strip attributes from every tag except <img>
  $body = preg_replace('~<(?!img)(\w+)\s+[^>]*>~i','<$1>', $body);
  // Remove embedded and interactive elements together with their content
  $body = preg_replace("/<(iframe.*?)>(.*?)<(\/iframe.*?)>/si", "", $body);
  $body = preg_replace("/<(object.*?)>(.*?)<(\/object.*?)>/si", "", $body);
  $body = preg_replace("/<(script.*?)>(.*?)<\/script>/si", "", $body);
  $body = preg_replace("~<(|/)form([^>]*)>~i", "", $body);
  $body = preg_replace("~<input([^>]*)>~i", "", $body);
  $body = preg_replace("/<(textarea.*?)>(.*?)<\/textarea>/si", "", $body);
  // Fixed typo: the tag is "button", not "botton"
  $body = preg_replace("/<(button.*?)>(.*?)<\/button>/si", "", $body);
  $body = preg_replace("/<(select.*?)>(.*?)<\/select>/si", "", $body);
  // Unwrap layout and link tags but keep the text inside them
  $body = preg_replace("~<(|/)div([^>]*)>~i", "", $body);
  $body = preg_replace("~<(|/)span([^>]*)>~i", "", $body);
  $body = preg_replace("~<(|/)font([^>]*)>~i", "", $body);
  // \b keeps this rule from also matching <abbr>, <article>, <audio>, ...
  $body = preg_replace('~</?a\b[^>]*>~i', "", $body);
  // The U modifier is dropped here: combined with .*? it made the match greedy
  $body = preg_replace("~<style[^>]*>(.*?)</style>~is", "", $body);
  $body = preg_replace("~<xml[^>]*>(.*?)</xml>~is", '', $body);
  $body = preg_replace("~<(|/)b>~i", "", $body);
  // Conditional comments first, then ordinary comments (lazy, may span lines)
  $body = preg_replace('~<!--\[if [^\]]+\]>(.*?)<!\[endif\]-->~is','', $body);
  $body = preg_replace('~<!--.*?-->~s','', $body);
  // Remove elements left empty by the steps above
  $body = preg_replace('~<(\w+)[^>]*>\s*</\\1>~Us', '', $body);
  // Remove line breaks and any whitespace that follows a tag
  $body = preg_replace("~[\r\n]+~",'', $body);
  $body = preg_replace("~>\s*~",'>', $body);
  // Clean up stray closing tags left over from unbalanced <object> markup
  $body = str_replace('</object>','', $body);
  return trim($body);
}
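
To make the effect of these rules easier to see, here is a minimal sketch that applies a condensed subset of them to a small fragment. The helper name `clean_fragment_demo` and the sample markup are hypothetical and exist only for illustration; this is not the full function above, just the same kind of ordered regex pass.

```php
<?php
// Hypothetical helper: a condensed subset of the rules above, for illustration only.
function clean_fragment_demo(string $html): string {
    // Patterns are applied in order, each one on the result of the previous one.
    $rules = [
        '~<(?!img)(\w+)\s+[^>]*>~i'         => '<$1>', // drop attributes on every tag except <img>
        '~<script\b.*?>.*?</script>~si'     => '',     // remove script blocks with their content
        '~</?(?:div|span|font|a)\b[^>]*>~i' => '',     // unwrap layout/link tags, keep their text
        '~[\r\n]+~'                         => '',     // remove line breaks
        '~>\s+~'                            => '>',    // remove whitespace after a tag
    ];
    return trim(preg_replace(array_keys($rules), array_values($rules), $html));
}

// Made-up sample input resembling a scraped article body.
$sample = "<div class=\"post\">\n"
        . "<p style=\"color:red\">Hello <a href=\"/x\">world</a></p>\n"
        . "<script type=\"text/javascript\">alert(1);</script>\n"
        . "<img src=\"a.png\" alt=\"A\">\n"
        . "</div>";

echo clean_fragment_demo($sample), PHP_EOL;
// prints: <p>Hello world</p><img src="a.png" alt="A">
```

The `<img>` tag keeps its attributes because of the `(?!img)` lookahead, while the link and the wrapper `<div>` disappear and only their text remains.
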
Very handy, and well suited to cleaning up fairly complex HTML after scraping.
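
With more complex pages, how greedy a pattern is decides how much text disappears. In PCRE the `U` modifier inverts greediness, so a `.*?` written under `U` behaves greedily. The sketch below compares both forms on a fragment with two `<style>` blocks; the sample string and variable names are made up for illustration.

```php
<?php
// Illustrative input: a paragraph sits between two <style> blocks.
$html = '<style>a{}</style><p>keep me</p><style>b{}</style>';

// With U, (.*?) turns greedy and runs to the LAST </style>,
// so the paragraph in between is removed as well.
$withU = preg_replace('~<style[^>]*>(.*?)</style>~iUs', '', $html);

// Without U, (.*?) stays lazy and stops at the nearest </style>.
$withoutU = preg_replace('~<style[^>]*>(.*?)</style>~is', '', $html);

var_dump($withU);    // string(0) ""
var_dump($withoutU); // string(14) "<p>keep me</p>"
```

When adapting patterns like these, it is worth using either `U` or the `?` suffix, but not both on the same quantifier.
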