Generating a sitemap with shell
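Sites made up mostly of static pages often have no convenient way to generate a sitemap. Still, it is best to build one yourself and either submit it through the search engine's webmaster tools or reference it in the robots file, since this speeds up indexing by Baidu, Google, and others. Taking a Baidu sitemap as the example, the shell code is as follows: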
<pre class="prettyprint linenums lang-bsh">cd /data/www
# Quote the pattern so the shell does not expand it before find sees it
find . -name '*.htm' > site.txt
# Turn the leading "./" into the site URL; "#" as delimiter avoids escaping "/"
sed -i 's#^\./#http://news.361way.com/#' site.txt
echo '<?xml version="1.0" encoding="UTF-8"?>' > sitemap.xml
echo '<urlset>' >> sitemap.xml
# One <url> entry per line of site.txt
awk '{
    print "<url>"
    print "<loc>" $1 "</loc>"
    print "<lastmod>2013-10-28</lastmod>"
    print "<changefreq>always</changefreq>"
    print "<priority>0.6</priority>"
    print "</url>"
}' site.txt >> sitemap.xml
echo '</urlset>' >> sitemap.xml</pre>
For the parameters used above, see the Baidu sitemap help page.
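The script stamps every entry with the same hard-coded lastmod date. As a rough sketch only, and assuming GNU find (for its -printf action), each file's own modification date could be used instead. The function name sitemap_entries and its two arguments are invented for this illustration and are not part of the original script.

```shell
# Hypothetical helper (not from the original script): print one <url> entry
# per *.htm file, using the file's modification date as <lastmod>.
# Assumes GNU find. $1 = document root, $2 = base URL without trailing slash.
sitemap_entries() {
    root=$1
    base=$2
    # %TY-%Tm-%Td prints the modification date as YYYY-MM-DD,
    # %P prints the path relative to the starting directory.
    find "$root" -name '*.htm' -printf '%TY-%Tm-%Td %P\n' | sort -k2 |
    while read -r day path; do
        printf '<url>\n<loc>%s/%s</loc>\n<lastmod>%s</lastmod>\n</url>\n' \
            "$base" "$path" "$day"
    done
}
```

It could be called as `sitemap_entries /data/www http://news.361way.com >> sitemap.xml`, between the two echo lines that write the opening and closing urlset tags.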
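A Google sitemap is much the same, and Google is more open to ordinary sites when it comes to indexing. Submit the sitemap directly on the Google webmaster tools page and the pages are usually indexed within about three days. Baidu, by contrast, only lets so-called high-quality users submit a sitemap.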
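For the robots-file route, crawlers that support it read a line of the form `Sitemap: <URL>` from robots.txt. A minimal sketch follows, assuming sitemap.xml is served from the site root; the function name add_sitemap_to_robots and the file paths are illustrative assumptions.

```shell
# Hypothetical helper: append a Sitemap line to robots.txt unless it is
# already present. $1 = path to robots.txt, $2 = public URL of the sitemap.
add_sitemap_to_robots() {
    robots=$1
    url=$2
    line="Sitemap: $url"
    # -x matches the whole line, -F treats the pattern as a fixed string;
    # a missing robots.txt simply counts as "line not found".
    if ! grep -qxF "$line" "$robots" 2>/dev/null; then
        printf '%s\n' "$line" >> "$robots"
    fi
}
```

For example: `add_sitemap_to_robots /data/www/robots.txt http://news.361way.com/sitemap.xml`. Running it repeatedly leaves a single Sitemap line in the file.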
Postscript, 2016-1-20 (in reply to zd's question):
# Contents of the site file
[root@361way ~]# more site.txt
./jenkins/lugins/atrix-project/elp/atrix/dk_ja.html
./jenkins/lugins/atrix-project/elp/atrix/xes_nl.html
./jenkins/lugins/atrix-project/elp/atrix/dk.html
./jenkins/lugins/atrix-project/elp/atrix/ombinationfilter_ja.html
./jenkins/lugins/atrix-project/elp/atrix/dk_nl.html
# Replace, then view the result
[root@361way ~]# sed -i 's#\./#http://news.361way.com/#g' site.txt
[root@361way ~]# more site.txt
http://news.361way.com/jenkins/lugins/atrix-project/elp/atrix/dk_ja.html
http://news.361way.com/jenkins/lugins/atrix-project/elp/atrix/xes_nl.html
http://news.361way.com/jenkins/lugins/atrix-project/elp/atrix/dk.html
http://news.361way.com/jenkins/lugins/atrix-project/elp/atrix/ombinationfilter_ja.html
http://news.361way.com/jenkins/lugins/atrix-project/elp/atrix/dk_nl.html
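The replacement works because sed accepts any character as the delimiter of its s command, so with # as the delimiter the slashes in the URL need no escaping. The sketch below can be tried without touching site.txt. The function name to_urls is made up for illustration, and the ^ anchor is an addition not present in the command shown in the transcript: it restricts the match to a "./" at the start of a line.

```shell
# Hypothetical helper: read relative paths on stdin, write absolute URLs.
# "#" is the s-command delimiter, so "/" in the URL needs no escaping;
# "^\./" matches only a literal "./" at the start of a line.
to_urls() {
    sed 's#^\./#http://news.361way.com/#'
}

# Try it on two sample paths instead of the real site.txt
printf '%s\n' './a/b.html' './c.html' | to_urls
```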
- Author: shisekong
- Link: https://blog.361way.com/shell-baidu-goole-sitemap/2819.html
- License: This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. Kindly fulfill the requirements of this license when adapting or creating a derivative of this work.