Data abstraction - universal/dynamic data structure

Recently I read two articles about data holders (search for the keywords 元数据、开放数据模型及动态系统, i.e. "metadata, open data model, and dynamic systems"). The scenario is that we constantly need to change data fields, e.g., add or delete some. This task is tedious because we usually need to update a database table and then add the fields to all corresponding classes. The repeated steps make people wonder whether there is a better way. One remedy is to hold all data as key-value pairs (in a Map) throughout the system; whenever we need a new field, we just add a new key-value pair to the Map and extract it wherever it is needed. This is a meta-level idea. Similar ideas have been around for at least the last six years, though the implementations vary.

Before we dive in, let's step back and take a bird's-eye view of the problem to see all of its aspects. The better we understand the problem, the better the solution we can find. If we look at the tedious way of changing a data field, the whole process rests on two facts:

  1. In Java every field is strongly typed. Before we can get or set a new data field, we have to write some kind of getter/setter method for it.
  2. POJOs are static, not dynamic like the Collection classes: the data structure of a POJO is fixed at compile time, so every data field change requires a code change.
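To make the two facts concrete, here is a minimal sketch of a typical POJO. The class name `Customer` and its `name` field are hypothetical and only serve as an illustration:

```java
// Hypothetical POJO: every field is declared, and typed, at compile time.
class Customer {
    private String name;

    // Fact 1: access to the typed field goes through a dedicated getter/setter pair.
    String getName() {
        return name;
    }

    void setName(String name) {
        this.name = name;
    }

    // Fact 2: adding an "email" field later means editing this class
    // (a new private field, a new getter, a new setter) and recompiling,
    // in addition to changing the matching database table.
}
```

Every class that mirrors the same table would need the same kind of edit, which is where the repeated steps come from.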

To fix the first problem, there are two common ways. One is to use a Map and create a universal get(key) method. The other is to make the fields public, so that you can access them directly with the dot operator.

In order to fix the second problem, a natural way is to use Collection classes.
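The Map-based holder described above could look roughly like the following sketch. The class name `DynamicRecord` and its methods are assumptions made for illustration, not code from the articles mentioned earlier:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical data holder: all fields live in one Map instead of in Java fields.
class DynamicRecord {
    private final Map<String, Object> fields = new HashMap<>();

    // Adding a "new field" is just adding a new key; the class itself never changes.
    void set(String key, Object value) {
        fields.put(key, value);
    }

    // The universal getter: one method for every field. Returns null for unknown keys.
    Object get(String key) {
        return fields.get(key);
    }

    // Lets callers distinguish a missing field from a field that holds null.
    boolean has(String key) {
        return fields.containsKey(key);
    }
}
```

A caller would write `record.set("email", "a@example.com")` and later `record.get("email")`. Note that the return type is `Object`, which is where the downsides discussed next begin.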

So the Collection classes fix both #1 and #2. But are there any downsides? Well, it depends. Consider the following scenarios:

  • Collection classes are not strongly typed, so any object you get from one is of type Object. If we want to call anything that is not declared in the Object class, we have to cast it to a particular class, so we end up with quite a few instanceof checks and class casts. While a class cast doesn't cost much in performance, instanceof does. One way to fix this is through metadata, i.e., we save the type somewhere. However, once we are in maintenance mode, there is no way to track where a property is used, because every property is accessed through the same get(key) method. This presents a hard roadblock when refactoring, and is possibly a flag for over-abstraction.
  • Using Collections in interfaces. This can add significant integration, troubleshooting, and debugging time, because both sides of the interface need to agree on the runtime data (whether data is present or missing and what format it has, in addition to the data values themselves). Although this flexibility is one of the major reasons to choose Collection classes, in my experience it costs more time later than it saves during coding.
  • Performance degrades when we have a large number of simultaneous insert/update/delete/retrieve operations. Most Collection classes are fast only for certain operations and slow for others; none is fast for all of them. This makes it hard to build a universal data structure.
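The "save the type somewhere" idea from the first scenario can be sketched with a typed key that carries the value's class as metadata. This is only one possible approach, and the names `Key` and `TypedRecord` are hypothetical:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical typed key: stores the expected value type alongside the field name.
final class Key<T> {
    final String name;
    final Class<T> type;

    Key(String name, Class<T> type) {
        this.name = name;
        this.type = type;
    }
}

class TypedRecord {
    private final Map<String, Object> fields = new HashMap<>();

    // The compiler checks that the value matches the key's declared type.
    <T> void set(Key<T> key, T value) {
        fields.put(key.name, value);
    }

    // Class.cast does the type check in one place, so call sites need
    // neither instanceof nor an explicit cast. Returns null for a missing key.
    <T> T get(Key<T> key) {
        return key.type.cast(fields.get(key.name));
    }
}
```

With `Key<Integer> AGE = new Key<>("age", Integer.class)`, a call such as `record.get(AGE)` returns an `Integer` directly. The sketch only addresses the casting issue; both sides of an interface still have to agree on which keys are present at runtime.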

I have only listed the cases that I am aware of; there could be other drawbacks. So we should use this approach only when we can somehow bypass these problems. For example, in a simple db-to-web application, we just get data from the db and display it on web pages. In this case, we don't care about the types of the fields and just use the toString() method for the output; there are few interfaces (maybe just a DAO); and we retrieve the data once and loop through it.
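The db-to-web case above can be sketched as follows, assuming the DAO returns each row as a `Map<String, Object>`. The class `RowRenderer` and the plain-text output format are assumptions for illustration only:

```java
import java.util.List;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical renderer: loops over the rows once and relies only on toString().
class RowRenderer {
    static String render(List<Map<String, Object>> rows) {
        StringBuilder out = new StringBuilder();
        for (Map<String, Object> row : rows) {
            StringJoiner line = new StringJoiner(", ");
            for (Map.Entry<String, Object> field : row.entrySet()) {
                // String.valueOf calls toString() and also tolerates null values.
                line.add(field.getKey() + "=" + String.valueOf(field.getValue()));
            }
            out.append(line).append('\n');
        }
        return out.toString();
    }
}
```

No casts or instanceof checks are needed here, because the values are never used as anything other than text.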

In moderately complex applications, however, we could have all three of the above scenarios, and it's unlikely that the Collection classes would fit the need, for maintenance and performance reasons.