I know this question has been asked a million times, but can someone please explain to me what exactly ADT means (in layman's terms if possible)?
I read this definition of an ADT: an ADT only specifies what operations are to be performed, not how those operations will be implemented.
The same is true of primitive data types.
Suppose we have a float data type. We know that multiplication, division, and so on can be performed (so we know what operations are available), but we don't know how they will be performed (multiplication could be done directly or by repeated addition, so two processes give the same result, and therefore it's abstract). So both kinds of data type are essentially the same. (I know that's incorrect.)
I know I'm getting it all wrong. Can someone please help me clear this concept?
Data types are classifications of data in a programming language, for example integers, characters, and floats.
An abstract data type is a theoretical concept. An abstract data type (ADT) is a mathematical model for data types in which a data type is defined by its behaviour (semantics) from the point of view of a user of the data, specifically in terms of possible values, possible operations on data of this type, and the behaviour of these operations. Put another way, it is a set of data values and associated operations that are precisely specified independently of any particular implementation.
For example, a stack is an abstract data type. A stack ADT can have the operations push, pop, and peek. These three operations define what the type is, irrespective of the implementation language.
So we can say that primitive data types are a form of abstract data type. They just happen to be provided by the language designers and are specific to the language. So basically there are two kinds of data types, primitive and user-defined, and both of them are abstract data types. I hope this makes it clear.
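To make the stack example concrete, here is a minimal Python sketch. The class names (Stack, ListStack, LinkedStack) and the helper drain are made up for illustration; the point is only that client code depends on push, pop, and peek, not on how the items are stored.

```python
from abc import ABC, abstractmethod


class Stack(ABC):
    """The ADT: only the operations and their behaviour are fixed."""

    @abstractmethod
    def push(self, item): ...

    @abstractmethod
    def pop(self): ...

    @abstractmethod
    def peek(self): ...


class ListStack(Stack):
    """One implementation: a Python list used as a growable array."""

    def __init__(self):
        self._items = []

    def push(self, item):
        self._items.append(item)

    def pop(self):
        return self._items.pop()  # raises IndexError when empty

    def peek(self):
        return self._items[-1]  # raises IndexError when empty


class LinkedStack(Stack):
    """Another implementation: a singly linked chain of (value, next) pairs."""

    def __init__(self):
        self._head = None

    def push(self, item):
        self._head = (item, self._head)

    def pop(self):
        if self._head is None:
            raise IndexError("pop from empty stack")
        item, self._head = self._head
        return item

    def peek(self):
        if self._head is None:
            raise IndexError("peek at empty stack")
        return self._head[0]


def drain(stack):
    """Client code written against the ADT only."""
    for x in (1, 2, 3):
        stack.push(x)
    return [stack.pop(), stack.pop(), stack.pop()]
```

Calling drain with either implementation gives the same result, which is what "defined by its operations" means in practice.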
Abstract Data Types - These are the building blocks for manipulating data that would otherwise just be a string of 1s and 0s. Would you rather have liquid metal, or nuts and bolts to build with?
It has been a long time since this question was asked, but I have an answer that might be a little clearer.
Data Type - Deals with a set of values, their representation and a set of operations that can be applied to them.
Abstract Data Type (ADT) - Deals with a set of values and a set of operations that can be performed on them.
The difference between the two is that an ADT is not concerned with how these values are represented.
I am researching hash tables and hash maps, and everything I have read or watched gives a very vague description of the differences. From messing around with both in NetBeans, they seem to have the same functions and do the same things. What are the fundamental differences between these two data structures?
There are no differences, but the same thing goes by different names in different programming languages, so what people call it depends on their background and the language they use. For example, the C++ standard library calls it std::unordered_map, while Java has both HashMap and Hashtable.
One difference could also be inferred from the naming: a hash table might store only hashed keys without values, whereas a hash map lets you retrieve a value by its hashed key. Internally both use the same algorithm and can be considered the same data structure.
HashTable sounds to me like a concrete data structure, although it has numerous variants depending on what happens when a collision occurs, when the table fills up, when it empties.
Map sounds like an abstract data structure, something defined by the available operations (Dictionary would be a potential other name for the same data structure, but I'd not be surprised if some nomenclature defined both with a nuance somewhere).
HashMap sounds like an implementation of the Map abstract data structure using a HashTable concrete data structure.
Again, I'd not be surprised if a language or a library provided both, with a nuance somewhere (HashMap, for instance, could provide only the operations defined for a Map, while HashTable provides everything that makes sense for a HashTable).
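The layering described in this answer can be sketched in Python as follows. This is only an illustration of the naming argument, not how any particular library is built; ChainedHashTable, HashMap, and their method names are hypothetical.

```python
class ChainedHashTable:
    """Concrete structure: an array of buckets, collisions resolved by chaining."""

    def __init__(self, bucket_count=8):
        self._buckets = [[] for _ in range(bucket_count)]

    def _bucket(self, key):
        return self._buckets[hash(key) % len(self._buckets)]

    def insert(self, key, value):
        bucket = self._bucket(key)
        for i, (k, _) in enumerate(bucket):
            if k == key:
                bucket[i] = (key, value)  # overwrite an existing key
                return
        bucket.append((key, value))

    def lookup(self, key):
        for k, v in self._bucket(key):
            if k == key:
                return v
        raise KeyError(key)

    def load_factor(self):
        # An operation that only makes sense for a hash table.
        return sum(len(b) for b in self._buckets) / len(self._buckets)


class HashMap:
    """Map ADT (put/get) implemented on top of the hash table."""

    def __init__(self):
        self._table = ChainedHashTable()

    def put(self, key, value):
        self._table.insert(key, value)

    def get(self, key):
        return self._table.lookup(key)
```

Here HashMap exposes only the Map operations, while ChainedHashTable also exposes load_factor, which is the kind of nuance the answer speculates about.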
I am new to data structures in computer science. I am trying to find out about all the ways lists can be implemented. I started with dynamic arrays, and I wanted to know whether a dynamic array can hold elements of different primitive data types.
I thought that "dynamic" only means that you can remove, insert, and add elements without caring about the array's size. But do you have to care about the types of the elements in the array too?
The terms you are looking for are heterogeneous and homogeneous. Heterogeneous lists can store different kinds of elements, while homogeneous lists are limited to one type of element.
Python is a good example of heterogeneous lists. They are implemented by storing references to the different objects in the list. So from a technical point of view they store homogeneous references, but from a user's perspective they store different types, such as integers, strings, and other objects.
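A short Python sketch of the distinction, using the standard-library array module as the homogeneous counterpart (the variable names are just for illustration):

```python
from array import array

# A Python list is heterogeneous from the user's point of view.
items = [42, "forty-two", 4.2, [4, 2]]
type_names = [type(x).__name__ for x in items]

# Each slot holds a reference, so one object can sit in two slots.
shared = [4, 2]
pair = [shared, shared]
same_object = pair[0] is pair[1]

# array("i", ...) is homogeneous: it stores raw C ints and rejects other types.
ints = array("i", [1, 2, 3])
try:
    ints.append("four")
    rejected = False
except TypeError:
    rejected = True
```

Both the list and the array can grow, so both are dynamic; only the list accepts mixed element types.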
The term dynamic data structure refers only to size and structure at runtime: a dynamic structure can grow or shrink while the program runs.
So for example, in C++ a built-in array is a static data structure, whereas a vector or a set is what you might call dynamic.
Storing multiple data types in one data structure is a separate matter; what you are describing there is typical of a dynamically typed language.
In a dynamically typed language such as Python, containers will generally accept elements of different types. The data structure itself need not be dynamic in size for that to happen.
I am aware that there are algorithms (and even tools) to transform relational databases (RDBMS) to Graph databases, and the other way around.
I do have several questions that are a bit larger than that:
1. Is there a common-practice working algorithm (or several) for such a transformation, for example RDBMS => graph?
2. Is this algorithm bijective? To be more precise:
2.1. Given said algorithm, is the transformation RDBMS => graph injective (one-to-one)? More plainly, can any two different relational DBs be transformed into the same graph DB?
2.2. Similarly, can any graph DB be represented by a relational DB? Basically, I'm asking whether the algorithm is surjective (onto).
TL;DR
There's typically an obvious bijection from a particular math notion of graph (node set, edge relation) to a relational representation. Essentially because the math uses sets and relations.
There's no standard graph DBMS. And no standard way to use one to represent application/business situations. So there's no standard mapping between a graph database state & a relational state, let alone one that gives a representation in the other that is natural for the situations represented.
Without relation-valued attributes, mappings are not always bijective between non-relational structures and relational structures because we must sometimes pick relational surrogate values 1:1 with the relation values we would have used.
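As an illustration of the first TL;DR point only, here is a minimal Python sketch of the obvious mapping between a mathematical graph (node set, edge relation) and a relational representation, with each relation modelled as a set of row tuples. The function names are hypothetical, and this says nothing about any particular graph DBMS.

```python
def graph_to_tables(nodes, edges):
    """Map a graph (node set, edge relation) to two relations (sets of rows)."""
    node_table = {(n,) for n in nodes}                  # one-column relation
    edge_table = {(src, dst) for (src, dst) in edges}   # two-column relation
    return node_table, edge_table


def tables_to_graph(node_table, edge_table):
    """Inverse mapping: recover the node set and edge relation from the rows."""
    nodes = {row[0] for row in node_table}
    edges = {(src, dst) for (src, dst) in edge_table}
    return nodes, edges


nodes = {"a", "b", "c"}
edges = {("a", "b"), ("b", "c")}
round_trip = tables_to_graph(*graph_to_tables(nodes, edges))
```

The round trip returns the original graph because both sides are built from the same sets and relations; the rest of the answer is about why this stops being straightforward once the structures represent an application situation.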
Sometimes we're not interested in a particular situation, we are just interested in a data structure. Then we can come up with (various) relational versions of it.
But a database or data structure variable typically represents an application/business situation. There is typically a one-to-many or one-to-one mapping from situations to representations. Under the relational model, every table has an associated (characteristic) predicate (statement template) and holds the rows that make a true proposition (statement) from its predicate. Other data structures are used in an ad hoc way to represent a situation.
What's special about the relational model is that you can generically query via predicate logic and/or relation operators--a query expression determines a predicate and its result holds the rows that make a true proposition from its predicate. (Calculated with certain complexity guarantees and certain opportunities for automated optimization.)
Mappings between structures that represent the same situation depend on how the databases represent situations. So there is no general mapping between representations, even for two representations using the same data structure.
On the other hand, you can define some generic mapping between two structures, and it might be bijective. But when a situation is represented by one structure, the mapped structure describes that representation of the situation, so it tells you about the situation only indirectly, not directly. So don't expect the relational version that describes the other structure's representation to be anything like a good relational design for that application/business.
This is the problem with ORMs & object databases. You can define a mapping from a particular object-oriented state to relations but the relations are only describing the object-oriented state, not its represented situation. Every time an object value holds an oid to an object referenced rather than contained, that referencing object is representing a relationship/association entity instance. But usually there is no explicit predicate given for the relation corresponding to the set of such objects. Instead we are given a representation function from some entire representing state to a represented situation. Whereas in a relational design every superkey value of every table (base or query result) is 1:1 with some (possibly associative) entity.
I hear many people referring to a tree as a data structure. But trees are mostly implemented using linked lists or arrays. Does that make a tree an abstract data type?
Given a type of structure, how can we determine whether it is a data structure or abstract data type?
If you are talking about a tree in general, without specifying its implementation or underlying data structure, then it is an abstract data type (ADT). An ADT is any data type that doesn't specify its implementation.
But once you start talking about a concrete tree with a specific implementation using linked lists or arrays, that concrete tree is a data structure.
With the above out of the way, the following may help you clear other confusions related to your question. Correct me if I'm wrong!
Data Type
The definition of data type from Wikipedia:
A data type or simply type is a classification identifying one of various types of data.
Data type is only a classification of data. It doesn't have any specifications about how those data are implemented. IMHO, data type is only a theoretical concept.
For example, any real number can be of the data type real. But along with integers, they can both be classified as a numeric data type, say number.
As I just pointed out, an ADT is one kind of data type. But can string and int be considered ADTs?
The answer is both yes and no.
Yes, because programming languages can implement string and int in many ways, but only on the condition that these data types share consistent properties across all programming languages.
No, because these primitive data types are not as abstract as stacks or queues. Since these data types seldom share consistent properties across programming languages, their users must know about underlying issues such as arithmetic overflow. Two languages may both have an int data type, but one may be unbounded while the other is limited to 32 bits. Must-know technical details like this are not what ADTs promise. Consider stacks instead. In every programming language, a stack promises consistent operations such as push and pop. You don't need to know anything else about the implementation; you just use a stack the same way in every language.
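The overflow point can be seen with a small Python sketch. Python's own int is unbounded, so the 32-bit behaviour is simulated here with a hypothetical helper, wrap_int32, that assumes signed two's-complement wrap-around (real languages differ in what they do on overflow).

```python
def wrap_int32(value):
    """Reduce an integer to the signed 32-bit two's-complement range."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value >= 0x80000000 else value


largest = 2**31 - 1                  # largest signed 32-bit value

unbounded = largest + 1              # Python int simply keeps growing
wrapped = wrap_int32(largest + 1)    # a wrapping 32-bit int flips to negative
```

The same expression, largest + 1, gives 2147483648 in one model and -2147483648 in the other, which is exactly the kind of detail a user of int has to know.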
Data Structure
Let's see the definition of data structure from Wiki:
A data structure is a particular way of organizing data in a computer so that it can be used efficiently.
As you can see, data structure is all about implementations. It is not conceptual but concrete. In my opinion, every piece of data in a program can by definition be considered as a data structure. A string can. An int can. And a whole bunch of other things like LinkedList_Stack or Array_Stack are all data structures.
Some of you might ask why int is a data structure. It is a data structure at a lower level, from the point of view of a programming language's author, because a programming language can store an int in a computer in many ways. The most common solution is two's complement; alternatives include offset binary and ones' complement. From a user's point of view, however, we see int as a primitive data type that the programming language offers out of the box, and we don't care about its implementation. It's just a building block of the language. So for us users, any data constructed from these building blocks (primitive data types) is more like a data structure. For authors of programming languages, the building blocks are lower-level machine code, so for them int is definitely a data structure.
Put simply, whether one thing is a data structure or not really depends on how we look at it.
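To illustrate how the same int can have several concrete layouts, here is a small Python sketch that prints 8-bit encodings under the three schemes mentioned above. The 8-bit width and the function names are assumptions made for the example.

```python
BITS = 8
MASK = 2**BITS - 1


def twos_complement(n):
    """Negative values wrap around modulo 2**BITS."""
    return format(n & MASK, f"0{BITS}b")


def ones_complement(n):
    """Negative values are the bitwise inverse of the magnitude."""
    if n >= 0:
        return format(n, f"0{BITS}b")
    return format(~(-n) & MASK, f"0{BITS}b")


def offset_binary(n):
    """The stored pattern is n plus a bias of 2**(BITS - 1)."""
    return format(n + 2**(BITS - 1), f"0{BITS}b")


encodings_of_minus_5 = {
    "two's complement": twos_complement(-5),   # 11111011
    "ones' complement": ones_complement(-5),   # 11111010
    "offset binary": offset_binary(-5),        # 01111011
}
```

A user of int sees the value -5 in all three cases; only the language implementer sees three different bit patterns.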
Via Google:
In computer science, an abstract data type (ADT) is a mathematical model for a certain class of data structures that have similar behavior
So clearly, it is both.
I am wondering whether there exists any declarative language for arbitrarily describing the format and semantics of a data structure, that can be compiled to a specific implementation of that structure in any of a set of target languages. That is, something like a generic data definition language but geared toward describing arbitrary data structures such as vectors, lists, trees, etc., and the semantics of operations on those structures. I ask because I had an idea for a feasible implementation of this concept, and I'm just wondering whether it's worth it, and, consequently, whether it's been done before.
Another, slightly more abstract question: is there any real difference between the normative specification of a data structure (what it does) and its implementation (how it does it)? More specifically, should separate implementations of the same requirements be considered different structures?
If you felt like it, a combination of XML with XSLT could describe a data structure and provide a matching definition in essentially any language of your choice. I've never tried to prove it formally, but my first guess would be that S-expressions are a superset of XML (modulo syntactical differences).
At least in theory, yes there are (or at least can be) differences between a description of what a data structure does, and how it does it. For an obvious example, you could describe a generic mapping from keys to values, which could use an implementation based on hash tables, skip lists, binary search trees, etc. It's mostly a question of describing it at a high enough level of abstraction to allow differences in the implementation. If you include too many requirements (complexity, ordering, etc.) you can pretty quickly rule out many implementations.
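A minimal Python sketch of that last point: two hypothetical classes, DictMap and SortedListMap, both satisfy a bare key-to-value specification (put/get), but once the specification also demands keys in sorted order, only one of them still qualifies.

```python
import bisect


class DictMap:
    """Hash-based implementation: satisfies put/get, iterates in insertion order."""

    def __init__(self):
        self._d = {}

    def put(self, key, value):
        self._d[key] = value

    def get(self, key):
        return self._d[key]

    def keys(self):
        return list(self._d)


class SortedListMap:
    """Sorted-array implementation: satisfies put/get and keeps keys ordered."""

    def __init__(self):
        self._keys = []
        self._values = []

    def _find(self, key):
        i = bisect.bisect_left(self._keys, key)
        found = i < len(self._keys) and self._keys[i] == key
        return i, found

    def put(self, key, value):
        i, found = self._find(key)
        if found:
            self._values[i] = value
        else:
            self._keys.insert(i, key)
            self._values.insert(i, value)

    def get(self, key):
        i, found = self._find(key)
        if not found:
            raise KeyError(key)
        return self._values[i]

    def keys(self):
        return list(self._keys)
```

Adding a complexity requirement (say, constant-time lookup on average) would rule out the sorted list instead, so each extra requirement narrows the set of acceptable implementations.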
You may be interested in messaging specification / data serialization languages such as Google's Protocol Buffers as well as ASN.1. It's a slightly different slant than you're looking for, but in the same vein.
Both are ways of declaring generic messages for communications. Protocol buffers message specs "compile" to different languages, but the central protocol is consistent. ASN.1 has multiple different compilation utilities as well as different protocol representations staying logically consistent with varying literal implementations. Look at XER, PER vs. BER for example.
I'd love a specification language that would just focus on simple packed binary layout against a logical memory structure. It may be that plain C structs are the simplest common way of expressing this. I had hoped ASN.1 had some way of getting to that, but after looking at it for a bit, ASN.1 PER is close, but not quite it.
Edit: Apache Thrift and Cap'n Proto may also be interesting.
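For the "simple packed binary layout" wish in this answer, Python's standard struct module gives a rough feel for what such a description looks like. The record layout below (a 16-bit id, an 8-bit flags field, a 32-bit float) is a made-up example, not taken from any of the tools mentioned.

```python
import struct

# Hypothetical record: uint16 id, uint8 flags, float32 value.
# "<" selects little-endian byte order with no padding between fields.
RECORD = struct.Struct("<HBf")

packed = RECORD.pack(513, 7, 1.5)
record_id, flags, value = RECORD.unpack(packed)
```

The format string plays the role of the layout specification: RECORD.size is 7 bytes here, whereas a C compiler would typically insert padding into the equivalent struct unless told not to.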
There are approaches to that sort of thing in dynamic logics, which attempt to capture the semantics of programs. However, the meaning in terms of the dynamic logic is in terms of preconditions and postconditions and is agnostic with regard to the actual implementation of the list.
These data structures are inherently tied to implementation, as the only difference between a linked list and array is specifically how it is laid out in memory.
For this, there already is a generic data definition language: any high-level programming language (C, C++, Java) specifies this. Any of them is as generic as the others, since within this context any of them could be compiled to another.
Cozy is “a tool that synthesizes data structure implementations from very high-level specifications” and seems to be essentially the language I was actually looking for (or considering writing) when I asked this question.
It can automatically generate an implementation (in Java or C++, at the time of this writing) from a data structure specification written in its own specification language. A specification describes the abstract state, update operations, and query operations of a data structure, as well as invariants that must be maintained and assumptions that can be used by the solver to optimise the implementation. For example, here is a partial specification for a graph data structure:
Graph:
    handletype Node = { id : Int }
    handletype Edge = { src : Int, dst : Int }
    state nodes : Bag<Node>
    state edges : Bag<Edge>
    // Invariant: disallow self-edges.
    invariant (sum [ 1 | e <- edges, e.val.src == e.val.dst ]) == 0;
    op addNode(n : Node)
        nodes.add(n);
    op addEdge(e : Edge)
        assume e.val.src != e.val.dst;
        edges.add(e);
    query out_degree(nodeId : Int)
        sum [ 1 | e <- edges, e.val.src == nodeId ]
    // …
Its implementation is described in “Fast Synthesis of Fast Collections” and “Generalized Data Structure Synthesis” by Calvin Loncaric, Emina Torlak, and Michael D. Ernst.
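For readers who don't know the specification language, the following Python class is a naive, hand-written reading of the graph specification's abstract state and operations. It is not what Cozy generates; the class name NaiveGraph is hypothetical, nodes and edges are simplified to plain ids and (src, dst) pairs, and the spec's caller-side assumption about self-edges is turned into a runtime check.

```python
class NaiveGraph:
    """Direct, unoptimised reading of the specification's abstract state."""

    def __init__(self):
        self.nodes = []  # stands in for Bag<Node>: a list of node ids
        self.edges = []  # stands in for Bag<Edge>: a list of (src, dst) pairs

    def add_node(self, node_id):
        self.nodes.append(node_id)

    def add_edge(self, src, dst):
        # The spec states this as an assumption on callers; here it is checked.
        if src == dst:
            raise ValueError("self-edges are not allowed")
        self.edges.append((src, dst))

    def out_degree(self, node_id):
        # Mirrors: sum [ 1 | e <- edges, e.val.src == nodeId ]
        return sum(1 for (src, _) in self.edges if src == node_id)
```

The query scans every edge on each call; choosing a better internal representation for such queries is the part the synthesizer is meant to automate.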