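Solr Indexing: Dynamic Fields, How They Work and Where to Use Them

Dynamic fields allow Solr to index fields that you did not explicitly define in your schema. This is useful when you discover that you forgot to define one or more fields. By giving you some flexibility in the documents you can add to Solr, dynamic fields also make your application less brittle.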
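
Basic concepts

A dynamic field is just like a regular field, except that its name contains a wildcard. When you index a document, any field that does not match an explicitly defined field can be matched against a dynamic field instead.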
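
Basic example

Suppose your schema includes a dynamic field named *_i:

```xml
<dynamicField name="*_i" type="int" indexed="true" stored="true" />
```

If you try to index a document that contains a cost_i field, and the schema defines no explicit cost_i field, then cost_i gets the field type and analysis defined for *_i.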
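
To make the cost_i case concrete, here is a minimal sketch of the kind of document that would rely on the *_i dynamic field. The id value and the presence of an id field in the schema are assumptions for illustration; the sketch only builds the JSON body and does not contact a Solr server.

```python
import json

# Hypothetical document: "cost_i" is not declared explicitly in the schema,
# so Solr is expected to resolve it through the *_i dynamic field.
doc = {"id": "sku-1001", "cost_i": 42}

# Solr's JSON update handler accepts a JSON array of documents.
payload = json.dumps([doc])
print(payload)
```

A body like this would typically be sent with an HTTP POST to the collection's /update handler.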

Components of a dynamic field

Like regular fields, dynamic fields have a name, a field type, and options.

Common dynamic field mappings

It is worth including a set of basic dynamic field mappings in your schema; they come in handy often. The type names used here (int, string, text_general and so on) must match fieldType definitions in your own schema; recent default configsets name the numeric and date types pint, plong, pfloat, pdouble and pdate instead.

1. Basic data types

```xml
<dynamicField name="*_i" type="int" indexed="true" stored="true" />
<dynamicField name="*_is" type="int" indexed="true" stored="true" multiValued="true" />

<dynamicField name="*_l" type="long" indexed="true" stored="true" />
<dynamicField name="*_ls" type="long" indexed="true" stored="true" multiValued="true" />

<dynamicField name="*_f" type="float" indexed="true" stored="true" />
<dynamicField name="*_fs" type="float" indexed="true" stored="true" multiValued="true" />

<dynamicField name="*_d" type="double" indexed="true" stored="true" />
<dynamicField name="*_ds" type="double" indexed="true" stored="true" multiValued="true" />

<dynamicField name="*_s" type="string" indexed="true" stored="true" />
<dynamicField name="*_ss" type="string" indexed="true" stored="true" multiValued="true" />

<dynamicField name="*_b" type="boolean" indexed="true" stored="true" />
<dynamicField name="*_bs" type="boolean" indexed="true" stored="true" multiValued="true" />

<dynamicField name="*_dt" type="date" indexed="true" stored="true" />
<dynamicField name="*_dts" type="date" indexed="true" stored="true" multiValued="true" />
```
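
On the client side, a suffix scheme like the one above can be applied mechanically. The following is a sketch only: the helper name dynamic_field_name is hypothetical, it assumes lists are non-empty and hold values of a single type, and it maps Python floats to *_d because they are double precision.

```python
from datetime import datetime

def dynamic_field_name(base, value):
    """Return base plus the dynamic-field suffix that fits value's Python type."""
    multi = isinstance(value, (list, tuple))
    sample = value[0] if multi else value  # assumption: non-empty, homogeneous list
    # bool must be tested before int because bool is a subclass of int.
    if isinstance(sample, bool):
        suffix = "b"
    elif isinstance(sample, int):
        # Values outside the signed 32-bit range need the long suffix.
        suffix = "i" if -2**31 <= sample < 2**31 else "l"
    elif isinstance(sample, float):
        suffix = "d"
    elif isinstance(sample, datetime):
        suffix = "dt"
    elif isinstance(sample, str):
        suffix = "s"
    else:
        raise TypeError(f"no dynamic field suffix for {type(sample).__name__}")
    # Multi-valued variants append "s": *_is, *_ss, *_dts, ...
    return f"{base}_{suffix}{'s' if multi else ''}"

print(dynamic_field_name("cost", 42))
print(dynamic_field_name("tags", ["red", "blue"]))
```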

2. Text types

```xml
<dynamicField name="*_txt" type="text_general" indexed="true" stored="true" />
<dynamicField name="*_txts" type="text_general" indexed="true" stored="true" multiValued="true" />

<dynamicField name="*_txt_en" type="text_en" indexed="true" stored="true" />
<dynamicField name="*_txt_en_split" type="text_en_splitting" indexed="true" stored="true" />

<dynamicField name="*_txt_zh" type="text_zh" indexed="true" stored="true" />
```

3. Special-purpose fields

```xml
<dynamicField name="*_stored" type="string" indexed="false" stored="true" />
<dynamicField name="*_indexed" type="text_general" indexed="true" stored="false" />
<dynamicField name="*_sort" type="string" indexed="false" stored="false" docValues="true" />
<dynamicField name="*_point" type="point" indexed="true" stored="true" />
```

Advanced dynamic field patterns

1. Prefix patterns

```xml
<dynamicField name="attr_*" type="text_general" indexed="true" stored="true" />
<dynamicField name="meta_*" type="string" indexed="true" stored="true" />
<dynamicField name="custom_*" type="text_general" indexed="true" stored="false" />
```
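
2. Infix patterns

Solr does not support infix patterns. A dynamic field name may contain exactly one asterisk, and it must be the first or the last character. Definitions such as these are rejected when the schema is loaded:

```xml
<dynamicField name="*_temp_*" type="float" indexed="true" stored="true" />
<dynamicField name="*_config_*" type="string" indexed="true" stored="true" />
```

3. Combined patterns

Patterns with the asterisk in the middle of the name, or with more than one asterisk, are invalid for the same reason:

```xml
<dynamicField name="product_*_price" type="float" indexed="true" stored="true" />
<dynamicField name="user_*_score" type="int" indexed="true" stored="true" />
<dynamicField name="*_category_*" type="string" indexed="true" stored="true" multiValued="true" />
```

To get a similar effect, rename the fields so that the variable part sits at the start or the end of the name, for example price_* instead of product_*_price.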
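
Solr accepts a dynamic field name only when it contains a single asterisk at the very start or the very end. A minimal sketch of that check, run against patterns from this section, can help catch invalid names before a schema is deployed. The function name is hypothetical and this is an illustration of the rule, not Solr's own code.

```python
def is_valid_field_glob(name):
    """True if name has exactly one '*' and it is the first or last character."""
    return name.count("*") == 1 and (name.startswith("*") or name.endswith("*"))

for pattern in ["*_i", "attr_*", "*", "*_temp_*", "product_*_price"]:
    status = "ok" if is_valid_field_glob(pattern) else "invalid"
    print(f"{pattern}: {status}")
```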

Practical use cases

1. E-commerce product attributes

```xml
<dynamicField name="spec_*" type="text_general" indexed="true" stored="true" />
<dynamicField name="feature_*" type="string" indexed="true" stored="true" />
<dynamicField name="price_*" type="float" indexed="true" stored="true" />
```
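
With prefixes like these, a nested product record can be flattened into a Solr document whose field names select the right dynamic field. This is a sketch under assumptions: the input layout (specs, features, prices keys) and the helper name are invented for illustration.

```python
def product_to_solr_doc(product):
    """Flatten a nested product record into spec_*, feature_* and price_* fields."""
    doc = {"id": product["id"]}
    for key, value in product.get("specs", {}).items():
        doc[f"spec_{key}"] = value          # resolved by spec_*
    for key, value in product.get("features", {}).items():
        doc[f"feature_{key}"] = value       # resolved by feature_*
    for currency, amount in product.get("prices", {}).items():
        doc[f"price_{currency}"] = float(amount)  # resolved by price_*
    return doc

# Hypothetical product record.
product = {
    "id": "p-1",
    "specs": {"screen": "6.1 inch OLED"},
    "features": {"color": "black"},
    "prices": {"usd": 699, "eur": 649},
}
print(product_to_solr_doc(product))
```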

2. User profiles

```xml
<dynamicField name="profile_*" type="text_general" indexed="true" stored="true" />
<dynamicField name="setting_*" type="string" indexed="true" stored="true" />
<dynamicField name="pref_*" type="boolean" indexed="true" stored="true" />
```

3. Multilingual content

```xml
<dynamicField name="title_*" type="text_general" indexed="true" stored="true" />
<dynamicField name="content_*" type="text_general" indexed="true" stored="true" />
<dynamicField name="description_*" type="text_general" indexed="true" stored="true" />
```

4. Time series data

```xml
<dynamicField name="metric_*" type="float" indexed="true" stored="true" />
<dynamicField name="count_*" type="int" indexed="true" stored="true" />
<dynamicField name="*_timestamp" type="date" indexed="true" stored="true" />
```
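
Priority and matching rules

Match priority

1. Explicitly defined fields have the highest priority.
2. More specific dynamic fields take precedence over more general ones: Solr orders dynamic field patterns by length and tries the longest pattern first.
3. Among patterns of equal length, the dynamic field defined first in the schema wins.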
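
Matching example

```xml
<!-- Explicit field: matches only the name "title" -->
<field name="title" type="text_general" indexed="true" stored="true" />
<!-- Matches title_en, title_zh, ... and is tried before shorter patterns -->
<dynamicField name="title_*" type="text_general" indexed="true" stored="true" />
<!-- Matches any remaining name ending in _s -->
<dynamicField name="*_s" type="string" indexed="true" stored="true" />
<!-- Catch-all: matches every name not matched above -->
<dynamicField name="*" type="text_general" indexed="true" stored="true" />
```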
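
The priority rules can be sketched as a small resolver over the definitions from the matching example. This is an illustration of the ordering described above, not Solr's implementation; the function names and the returned tuple shape are assumptions.

```python
# Explicit fields and dynamic fields, taken from the matching example.
EXPLICIT_FIELDS = {"title": "text_general"}
# (pattern, type) pairs in schema order.
DYNAMIC_FIELDS = [("title_*", "text_general"), ("*_s", "string"), ("*", "text_general")]

def glob_matches(pattern, name):
    """Match a pattern whose single '*' is the first or last character."""
    if pattern.startswith("*"):
        return name.endswith(pattern[1:])
    return name.startswith(pattern[:-1])

def resolve(name):
    """Return (kind, matched definition, field type) for a field name, or None."""
    if name in EXPLICIT_FIELDS:
        return ("explicit", name, EXPLICIT_FIELDS[name])
    # Longest pattern first; sorted() is stable, so ties keep schema order.
    for pattern, field_type in sorted(DYNAMIC_FIELDS, key=lambda d: -len(d[0])):
        if glob_matches(pattern, name):
            return ("dynamic", pattern, field_type)
    return None

for name in ["title", "title_s", "author_s", "misc"]:
    print(name, resolve(name))
```

Note that title_s matches both title_* and *_s; under the length rule the longer pattern title_* wins, so the field is analyzed as text_general rather than stored as a plain string.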

Best practices

1. Naming conventions

```xml
<dynamicField name="*_i" type="int" indexed="true" stored="true" />
<dynamicField name="*_s" type="string" indexed="true" stored="true" />
<dynamicField name="*_t" type="text_general" indexed="true" stored="true" />

<dynamicField name="*_is" type="int" indexed="true" stored="true" multiValued="true" />
<dynamicField name="*_ss" type="string" indexed="true" stored="true" multiValued="true" />

<dynamicField name="*_en" type="text_en" indexed="true" stored="true" />
<dynamicField name="*_zh" type="text_zh" indexed="true" stored="true" />
```

2. Performance-oriented configuration

```xml
<dynamicField name="*_facet" type="string" indexed="true" stored="false" docValues="true" />
<dynamicField name="*_sort" type="string" indexed="false" stored="false" docValues="true" />
<dynamicField name="*_search" type="text_general" indexed="true" stored="false" />
<dynamicField name="*_display" type="string" indexed="false" stored="true" />
```

3. Defensive configuration

```xml
<!-- Drop fields ending in _ignored (assumes an "ignored" type that is neither indexed nor stored) -->
<dynamicField name="*_ignored" type="ignored" multiValued="true" />
<!-- Catch-all: index any other unknown field as general text -->
<dynamicField name="*" type="text_general" indexed="true" stored="true" />
```

4. Modular organization

```xml
<!-- Basic types -->
<dynamicField name="*_i" type="int" indexed="true" stored="true" />
<dynamicField name="*_l" type="long" indexed="true" stored="true" />
<dynamicField name="*_f" type="float" indexed="true" stored="true" />
<dynamicField name="*_d" type="double" indexed="true" stored="true" />
<dynamicField name="*_b" type="boolean" indexed="true" stored="true" />
<dynamicField name="*_dt" type="date" indexed="true" stored="true" />

<!-- Text types -->
<dynamicField name="*_txt" type="text_general" indexed="true" stored="true" />
<dynamicField name="*_txt_en" type="text_en" indexed="true" stored="true" />
<dynamicField name="*_txt_zh" type="text_zh" indexed="true" stored="true" />

<!-- Multi-valued types -->
<dynamicField name="*_is" type="int" indexed="true" stored="true" multiValued="true" />
<dynamicField name="*_ss" type="string" indexed="true" stored="true" multiValued="true" />
<dynamicField name="*_txts" type="text_general" indexed="true" stored="true" multiValued="true" />

<!-- Special-purpose types -->
<dynamicField name="*_coordinate" type="location" indexed="true" stored="true" />
<dynamicField name="*_path" type="path_hierarchy" indexed="true" stored="true" />
<dynamicField name="*_currency" type="currency" indexed="true" stored="true" />
```
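
Caveats

1. Performance impact

- Every field name that has no explicit definition must be matched against the dynamic field patterns, which adds some overhead.
- A large number of dynamic field definitions can affect performance.
- For fields you use all the time, prefer explicit field definitions over dynamic fields.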
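
2. Maintainability

- Dynamic fields can make a schema harder to understand.
- Document clearly what each dynamic field is for.
- Review the schema regularly and remove dynamic fields that are no longer needed.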
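
A periodic review can be partly automated by comparing the declared patterns with the field names that actually occur in the index. The sketch below assumes you already have both lists (the field names might come, for example, from Solr's Luke request handler); the function name is hypothetical, and the check ignores priority, so a pattern that is always shadowed by a longer one still counts as used.

```python
def unused_patterns(patterns, observed_fields):
    """Return the dynamic field patterns that match none of the observed names."""
    def matches(pattern, name):
        # Patterns carry a single '*' as their first or last character.
        if pattern.startswith("*"):
            return name.endswith(pattern[1:])
        return name.startswith(pattern[:-1])

    return [p for p in patterns if not any(matches(p, f) for f in observed_fields)]

# Hypothetical inputs.
declared = ["*_i", "meta_*", "*_dt"]
observed = ["cost_i", "meta_author"]
print(unused_patterns(declared, observed))
```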
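
3. Type safety

- Make sure each dynamic field is defined with the correct type.
- Watch for problems caused by data type conversion.
- Validate field values with an appropriate mechanism before indexing.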
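
One possible validation mechanism is a client-side check that compares each value with the type implied by its suffix before the document is sent. This is a sketch covering only the *_i, *_b and *_s suffixes; the names are hypothetical, and the policy is deliberately stricter than Solr itself, which can parse numeric strings.

```python
INT32_MIN, INT32_MAX = -2**31, 2**31 - 1

# Suffix -> predicate deciding whether a value is acceptable for that suffix.
SUFFIX_CHECKS = {
    # bool is a subclass of int in Python, so exclude it explicitly.
    "_i": lambda v: isinstance(v, int) and not isinstance(v, bool)
    and INT32_MIN <= v <= INT32_MAX,
    "_b": lambda v: isinstance(v, bool),
    "_s": lambda v: isinstance(v, str),
}

def invalid_fields(doc):
    """Return the names of fields whose values do not fit their suffix."""
    bad = []
    for name, value in doc.items():
        for suffix, is_ok in SUFFIX_CHECKS.items():
            if name.endswith(suffix) and not is_ok(value):
                bad.append(name)
    return bad

# Hypothetical document with two problems: a non-numeric int and an overflow.
doc = {"id": "1", "cost_i": "abc", "qty_i": 3, "active_b": True, "big_i": 2**31}
print(invalid_fields(doc))
```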
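
Summary

Dynamic fields are a powerful flexibility mechanism in Solr that lets you handle fields you did not define in advance. With sensible naming conventions and configuration, you can design a schema that is both flexible and efficient. In practice, balance flexibility against performance, avoid overusing dynamic fields, and keep the schema well documented and maintained.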