Using Jsoup to look up and parse a mobile number's home location

jilong-liang

Blog category: Jsoup

package com.test;

import java.io.IOException;
import java.net.URL;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

import org.apache.commons.httpclient.HttpException;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;
/**
 * Looks up the home location of a mobile number using Jsoup.
 * Approach: fetch the result page, then parse its HTML.
 * @author ljl
 */
public class Test2 {

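	/**
	 * Regular expression that extracts the home location from the result page.
	 * Group 2 captures the cell that follows the "卡号归属地" (card home location) label cell.
	 */
	public static final String REGEX_GET_MOBILE = "(?is)(<tr[^>]*>\\s*<td[^>]*>\\s*卡号归属地\\s*</td>\\s*<td[^>]*>([^<]+)</td>\\s*</tr>)"; // 2:from
	/**
	 * Regular expression that checks the number format before the lookup.
	 * Accepts either a full 11-digit number or only its first 7 digits.
	 * Inside a character class "|" is a literal, so the prefixes are listed as [3458].
	 */
	public static final String REGEX_IS_MOBILE = "^1[3458]\\d{5}(?:\\d{4})?$";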

	/**
	 * Extracts the home location from the result page returned by www.ip138.com.
	 * The result has the form "province city".
	 * @param htmlSource HTML source of the result page
	 * @return the home location, or null if no matching row is found
	 */
	public static String parseMobileFrom(String htmlSource){
		Matcher m=Pattern.compile(REGEX_GET_MOBILE).matcher(htmlSource);
		String result=null;
		// Keep the last matching row; group 2 is the location cell.
		while(m.find()){
			result=m.group(2).replace("&nbsp;", " ");
		}
		return result;
	}
	
	/**
	 * Validates the mobile number.
	 * @param mobileNumber a full 11-digit number or its first 7 digits
	 * @return true if the number matches REGEX_IS_MOBILE
	 */
	public static boolean verifyMobile(String mobileNumber){
		return Pattern.compile(REGEX_IS_MOBILE).matcher(mobileNumber).matches();
	}

	public static void main(String[] args) throws Exception {
		String mobile="13800138000";
		getNetFormMobileInfo(mobile);
	}

	private static void getNetFormMobileInfo(String mobileNumber) throws IOException, HttpException {
		// Reject invalid input before any request is sent.
		if(!verifyMobile(mobileNumber)){
			throw new IllegalArgumentException("Not a complete 11-digit mobile number or a valid 7-digit prefix");
		}
		StringBuilder buffer = new StringBuilder();
		String url = "http://www.ip138.com";
		buffer.append(url);
		buffer.append(":8080"); // port
		buffer.append("/");
		buffer.append("search.asp?");
		buffer.append("mobile=" + mobileNumber);
		buffer.append("&action=mobile");

		String basePath = buffer.toString();

		// Jsoup.parse throws an IOException on network failure; it does not return null.
		Document doc=Jsoup.parse(new URL(basePath), 3000);
		// Find the elements with class="tdc", then the class="tdc2" cells inside them.
		Elements tdcs = doc.getElementsByAttributeValue("class", "tdc");
		for(Element td:tdcs){
			Elements tdc2s=td.getElementsByAttributeValue("class","tdc2");
			for(Element tdc:tdc2s){
				// Strip HTML tags, &nbsp; entities, and leftover "-->" comment terminators.
				String mobileInfo=tdc.select("td").html().replaceAll("<[^>]+>", "").replaceAll("&nbsp;", "").replaceAll("-->", "");
				System.out.println(mobileInfo);
			}
		}
	}
}
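
The validation and URL-building steps in the listing above can be tried without any network access or the Jsoup dependency. The following is a minimal sketch under a few assumptions: the class name `MobileLookupSketch` and the helpers `isMobile` and `buildQueryUrl` are hypothetical names introduced only for illustration, while the host, port, and query parameters are copied from the listing. The character class is written as `[3458]`, because inside a character class `|` is a literal character, so a class written as `[3|4|5|8]` would also accept an input such as `1|800138000`.

```java
import java.util.regex.Pattern;

public class MobileLookupSketch {

    // Assumed rule: a 7-digit prefix or a full 11-digit number
    // starting with 13, 14, 15 or 18 (the prefixes used in the listing).
    private static final Pattern MOBILE =
            Pattern.compile("^1[3458]\\d{5}(?:\\d{4})?$");

    // Returns true for a well-formed prefix or full number.
    public static boolean isMobile(String number) {
        return number != null && MOBILE.matcher(number).matches();
    }

    // Builds the lookup URL from the same pieces the listing appends one by one.
    // Throws instead of continuing with a number that failed validation.
    public static String buildQueryUrl(String number) {
        if (!isMobile(number)) {
            throw new IllegalArgumentException(
                    "Not a valid mobile number or 7-digit prefix: " + number);
        }
        return "http://www.ip138.com:8080/search.asp?mobile=" + number + "&action=mobile";
    }

    public static void main(String[] args) {
        System.out.println(isMobile("13800138000")); // full 11-digit number
        System.out.println(isMobile("1380013"));     // 7-digit prefix
        System.out.println(isMobile("138001380"));   // 9 digits, rejected
        System.out.println(isMobile("1|800138000")); // literal '|', rejected
        System.out.println(buildQueryUrl("13800138000"));
    }
}
```

Throwing `IllegalArgumentException` here is a design choice of the sketch: it stops the lookup as soon as validation fails, so no request is sent for a malformed number.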


2014-04-18 10:51
Category: Programming languages

Comment #1, gnomewarlock, 2014-04-18

The title should say "parsing HTML".
