
    Analysis 2009: Semantics continues to not be RDF, but enrichment, classification and taxonomy

    http://broadcast.oreilly.com/2009/01/analysis-2009-semantics-contin.html

    By Kurt Cagle January 6, 2009

    Within the realm of computational semantics, there is still a fairly broad disconnect between triple-based semantics, meaning the use of RDF (or Turtle notation) to create atomic assertions, and semantics as it is actually reflected on the web. I do not expect this to change much in 2009, except perhaps that the gulf between the two will likely get wider.

    While I think that RDF (or more likely a successor to that set of specifications) will eventually go on to become the overall semantic tier of the web, it's rather depressing just how far we really are from RDF actually being widely adopted. Instead, the approaches in use are still largely proprietary, though there are a few interesting areas that should be watched fairly closely.

    One of the open source projects that received a fair amount of buzz at OSCON 2008 was Freebase (http://www.freebase.com), a site which mines Wikipedia in order to establish an extraordinarily comprehensive RDF database about topics gleaned from the linkages found within Wikipedia. One benefit of this is that you can run queries such as "List all movies containing aliens", and the tools at hand will show the content that matches. This in turn makes it possible to run relational queries on Wikipedia data, making Freebase particularly useful as a research and data mining tool.
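
    A minimal sketch of how a question like "List all movies containing aliens" reduces to matching patterns against stored triples. The data, the predicate names and the `match` and `movies_featuring` helpers are invented for illustration; they are not Freebase's schema or API.

```python
# A tiny in-memory triple store: each entry is (subject, predicate, object).
# Data and predicate names are invented; this is not Freebase's actual schema.
triples = [
    ("Alien", "type", "Movie"),
    ("Alien", "features", "aliens"),
    ("E.T.", "type", "Movie"),
    ("E.T.", "features", "aliens"),
    ("Casablanca", "type", "Movie"),
    ("Casablanca", "features", "romance"),
]

def match(store, s=None, p=None, o=None):
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return [
        (ts, tp, to)
        for (ts, tp, to) in store
        if (s is None or ts == s)
        and (p is None or tp == p)
        and (o is None or to == o)
    ]

def movies_featuring(store, topic):
    """Join two patterns: ?m type Movie AND ?m features <topic>."""
    movies = {s for (s, _, _) in match(store, p="type", o="Movie")}
    featuring = {s for (s, _, _) in match(store, p="features", o=topic)}
    return sorted(movies & featuring)

print(movies_featuring(triples, "aliens"))
```

    A query language such as SPARQL expresses the same join declaratively; the set intersection here only stands in for that idea.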

    My sense is that you'll see more of these types of applications showing up in 2009, apps built around the power of RDF (and increasingly SPARQL, the RDF query language). However, it's also probable that few, if any, of these sites will tout their Semantic Web credentials (or even acknowledge that this is what is going on under the hood).

    One area that I feel is poised to really take off in the next year is content enrichment. Enrichment involves taking a collection of text, running a series of rules and contextual filters on the data to look for names, events and patterns, then encasing this content within specialized XML markup. Depending upon the database, the source, and the service agreements involved, such enrichment performs an invaluable service: it establishes the context of a given phrase within an article and, by extension, makes it possible to provide both an abstract of the content and a specialized search for meta-content within a document.

    For instance, an article about Barack Obama and John McCain could be abstracted as being about the presidential contest, while an article about Barack Obama and George Bush might talk about transitions of power from one president to the next, with specific terms for each of these people (and related people determined by this context, highlighted as tagged content).
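
    The enrichment step described above can be sketched in a few lines of Python. The `KNOWN_ENTITIES` gazetteer, the `TOPIC_RULES` table, the `<person ref="...">` markup and the `enrich` function are all hypothetical; a real enrichment service would use far richer rules and its own markup vocabulary.

```python
import re
from xml.sax.saxutils import escape

# Hypothetical gazetteer: surface name -> (XML tag, identifier).
KNOWN_ENTITIES = {
    "Barack Obama": ("person", "obama"),
    "John McCain": ("person", "mccain"),
    "George Bush": ("person", "bush"),
}

# Hypothetical rule table: set of people mentioned -> abstract topic.
TOPIC_RULES = {
    frozenset({"obama", "mccain"}): "presidential contest",
    frozenset({"obama", "bush"}): "transition of power",
}

def enrich(text):
    """Wrap known names in XML markup and guess a topic from who co-occurs."""
    pattern = re.compile("|".join(re.escape(name) for name in KNOWN_ENTITIES))
    found = set()
    parts = []
    last = 0
    for m in pattern.finditer(text):
        tag, ident = KNOWN_ENTITIES[m.group(0)]
        found.add(ident)
        # Escape the untouched text so the result stays well-formed XML.
        parts.append(escape(text[last:m.start()]))
        parts.append(f'<{tag} ref="{ident}">{escape(m.group(0))}</{tag}>')
        last = m.end()
    parts.append(escape(text[last:]))
    topic = TOPIC_RULES.get(frozenset(found), "unknown")
    return "".join(parts), topic

markup, topic = enrich("Barack Obama met George Bush at the White House.")
print(topic)
print(markup)
```

    The tagged names are what a specialized search would index, and the topic guess plays the role of the abstract.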

    Again, this is a service that both commercial and open source XML databases and content management systems are beginning to provide to their customers. It also illustrates what is increasingly becoming the norm in business applications: situations where critical processing of data streams is applied through web services by third party providers.

    This is an area that is ripe for standardization. I suspect that the RDF crowd will probably be jumping up and down at this stage screaming "Use RDF! Use RDF!!" but I'm not really sure that will end up happening, at least not directly. I wrote last year about CURIEs and RDFa, an attribute-carried RDF descriptor language for text content. With that specification now a full Recommendation (as of October 2008), I suspect that many vendors may start offering it as an alternative output format, which raises the very real possibility that it could become the de facto standard for enrichment (or form the foundation for one) by late 2010.
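
    To show what "attribute-carried" means in practice, here is a hedged sketch that emits an enriched name as an RDFa-annotated span. The `about`, `typeof` and `property` attributes and the `foaf:` CURIE prefix come from RDFa and FOAF; the `rdfa_span` helper and the example.org URI are assumptions made for illustration, not any vendor's actual output.

```python
from xml.sax.saxutils import escape, quoteattr

# The FOAF namespace, which the foaf: CURIE prefix must be bound to
# on an enclosing element of the host document.
FOAF_NS = "http://xmlns.com/foaf/0.1/"

def rdfa_span(name, subject_uri, rdf_type="foaf:Person", prop="foaf:name"):
    """Wrap a name in a span whose attributes carry the RDF statements.

    Hypothetical helper: the attributes say that subject_uri is of type
    rdf_type and has the property prop with the span text as its value.
    """
    return (
        f"<span about={quoteattr(subject_uri)} "
        f"typeof={quoteattr(rdf_type)} "
        f"property={quoteattr(prop)}>{escape(name)}</span>"
    )

span = rdfa_span("Barack Obama", "http://example.org/people/obama")
# The prefix binding lives on the enclosing element.
print(f'<p xmlns:foaf="{FOAF_NS}">{span} spoke today.</p>')
```

    The visible text is unchanged for a human reader, while an RDFa-aware processor can read the statements back out of the attributes.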

    My central problem with RDF is that it is a brilliant technology that tried to solve too big a problem too early on by establishing itself as a way of building "dynamic" ontologies. Most ontologies are ultimately dynamic, changing and shifting as the requirements for their use change, but at the same time such ontologies change relatively slowly over time.

    This means that the benefit of specifying a complex RDF Schema for an ontology, which can be a major exercise in hair pulling, is typically realized only in the very long term for most ontologies, and that in general the flexibility offered by RDF in that regard is much like trying to build a skyscraper out of silly putty. It's possible to do so (maybe), but the drawbacks in the increased complexity of code (especially given that most people are still having trouble understanding the relatively simple syntax of XPath) make it a dubious proposition at best, except for highly interconnected information spaces with comparatively few constraints acting on them, such as Freebase.

    What I see happening instead is that there should be fairly significant consolidation of specifications down to a few consortia standards in any given domain, such as XBRL in business reporting, HL7 in health care, S1000D in airline specifications and so forth. Even five years ago, most industries tended to have two or more distinct standards competing for adoption, but in the last year many of these dual standard industries have either settled on one or merged the two standards together. Thus I see 2009 being devoted to application development around an industry's preferred vertical ... with opportunities especially for those who work in developing such standards in the first place.

    In other words, it's very likely that in order for the RDF/Semantic Web approach to gain credence in these spaces, ontologists will have to start with these specific industry schemas and develop RDF-based tools that model them. Given that I see XML databases increasingly carrying the load in working with these schemas, this will also likely result, at some point in the not too distant future, in a need for a meeting of the minds between the XQuery working group and the SPARQL working group in order to develop a SPARQL analog that can be run in XQuery, probably as a set of optional modular extensions to the language. I don't know if this is on the agenda at the W3C yet, though if it's not, then it's likely we won't see significant traction there until 2011 at the earliest.
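
    One way to picture "RDF-based tools that model" an industry schema is a converter that lifts an XML record into triples. The sketch below assumes an invented report fragment (its element names are not taken from XBRL, HL7 or S1000D), an example.org base URI and a hypothetical `xml_to_triples` function.

```python
import xml.etree.ElementTree as ET

# Invented report fragment; the element names are illustrative only.
DOC = """<report id="r1">
  <company>Example Corp</company>
  <revenue unit="USD">1000</revenue>
</report>"""

def xml_to_triples(xml_text, base="http://example.org/"):
    """Lift a flat XML record into (subject, predicate, object) triples.

    Simplifying assumptions: the root carries an id attribute, each child
    element becomes one predicate, and child attributes (such as unit)
    are ignored.
    """
    root = ET.fromstring(xml_text)
    subject = base + root.get("id")
    triples = [(subject, base + "type", base + root.tag)]
    for child in root:
        value = (child.text or "").strip()
        triples.append((subject, base + child.tag, value))
    return triples

for triple in xml_to_triples(DOC):
    print(triple)
```

    A production mapping would also have to carry units, datatypes and nested structure, which is where most of the modeling effort would go.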

    The other area where there's been something of a "small s" semantic revolution has been the growing awareness of the intimate link between web navigation and knowledge navigation among both web developers and semantics specialists. As web sites grow, they become more complex, deeper, and far more difficult to maintain in terms of their underlying structure.

    Ultimately this comes down to a question of classification and partition of the topics within the site itself, and this in turn points to a potential semantic solution for managing large and topically interconnected content. The folksonomy "tagging" revolution (which I think is probably running out of steam) was a significant first step, but folksonomies are by their nature unstructured and poorly regulated.
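
    The gap between free-form tags and a managed classification can be sketched as follows. The `BROADER` taxonomy, the `SYNONYMS` table and the `classify` function are hypothetical and do not reflect how Drupal or any other framework stores its taxonomy.

```python
# Hypothetical taxonomy: term -> broader term (None marks a top-level term).
BROADER = {
    "technology": None,
    "semantic web": "technology",
    "rdf": "semantic web",
    "sparql": "semantic web",
    "xml": "technology",
}

# Hypothetical synonym table that normalizes free-form folksonomy tags.
SYNONYMS = {"semweb": "semantic web", "rdf/xml": "rdf"}

def path_to_root(term):
    """Return the chain of terms from the top of the taxonomy down to term."""
    path = []
    while term is not None:
        path.append(term)
        term = BROADER[term]
    return list(reversed(path))

def classify(tags):
    """Place each tag in the taxonomy; collect the tags that have no home."""
    placed, unplaced = {}, []
    for tag in tags:
        term = SYNONYMS.get(tag.lower(), tag.lower())
        if term in BROADER:
            placed[tag] = path_to_root(term)
        else:
            unplaced.append(tag)
    return placed, unplaced

placed, unplaced = classify(["RDF", "semweb", "kittens"])
print(placed)
print(unplaced)
```

    Tags that land in the taxonomy inherit a navigation path, while the unplaced ones show where a flat folksonomy gives a site nothing to organize around.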

    I think this is going to be the year that both web design and web frameworks embrace semantic tools and concepts (the inclusion of RDF support within the taxonomy-heavy Drupal system is a good case in point).



    Posted by admin on 2009/1/9 22:14:00
     