Updates


Oct 23, 2023

Progress

The IPFS whitepaper has been read, along with several research papers describing and assessing the system in general. Notes on the system and summaries of the research papers have also been written. At this point, the general direction of the project will be as follows:

Documentation will be written about the structure and purpose of IPFS by using the whitepaper, IPFS developer docs, and research papers as sources. This will be done to condense and simplify the existing documentation.

Plan

The plan is also to do some technical research on the system itself. We will try to host an IPFS node and document the process, with emphasis on the user's perspective. We will also try to replicate some of the research in the papers that relates to the performance of IPFS (e.g. object retrieval time and latency). An emphasis will be placed on storing and retrieving multimedia files such as images and video.

IPFS Academic Papers

Finally, we will make an assessment of the strengths, weaknesses, and overall viability of IPFS. This will include design elements that we want to be included or want to change, and what direction the system should go in the future.

Next Steps

The next steps of the project are to create an outline/skeleton of the IPFS specification, host a node, determine which experiments can be run on the hardware we have in a reasonable amount of time, and host multimedia files on IPFS.


Nov 6, 2023

Most of the project time has been spent on research of the IPFS system documentation, reading relevant papers, and creating notes.

Primary references: see citations

Introduction to IPFS

The goal of IPFS is to construct a single distributed filesystem that can be accessed by any computer. There is no single point of failure, and nodes in the system do not need to trust each other.[1]

HTTP has many limitations that are now coming to light, with challenges such as handling massive datasets and interacting with data across many organizations. The way the web was originally constructed is not suitable for all of today's uses, and new protocols and systems are needed.[1]

Components

IPFS is built from a selection of preexisting technologies that are combined into a modular stack.[1]

There are four important aspects which make IPFS work the way it does:

  1. Content-based addressing
  2. Decentralized object indexing
  3. Peer addressing
  4. Immutability

Content Addressing [2]

When content (e.g. a file) is added to IPFS, it is split into chunks of a fixed size (256 kB by default), and each chunk is given a Content Identifier (CID).

A CID has a defined structure: a short header with some metadata, followed by the hash of the content. From all of the chunks of a file, IPFS builds a Merkle Directed Acyclic Graph (DAG) of the file.

cid version | multicodec | length | actual hash
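The chunking and CID layout described above can be sketched roughly as follows. This is a simplified illustration, not how a real IPFS node is implemented: it assumes the 256 kB chunk size means 256 KiB, uses SHA-256 with the "raw" codec, and the helper names (split_chunks, make_cid, toy_root) are hypothetical. In particular, toy_root only stands in for the Merkle DAG root; a real node encodes the links between chunks in its own DAG node format.

```python
import base64
import hashlib

# Assumption: the 256 kB chunk size is taken to mean 256 KiB.
CHUNK_SIZE = 256 * 1024


def split_chunks(data: bytes, size: int = CHUNK_SIZE) -> list[bytes]:
    """Split content into fixed-size chunks (the last one may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def make_cid(block: bytes, codec: int = 0x55) -> str:
    """Build a CIDv1-style identifier for one block.

    Layout: cid version | multicodec | hash code | digest length | digest.
    0x55 is the "raw" codec and 0x12 is the code for SHA-256.
    """
    digest = hashlib.sha256(block).digest()
    multihash = bytes([0x12, len(digest)]) + digest
    raw = bytes([0x01, codec]) + multihash
    # Base32, lowercase, no padding, with the "b" multibase prefix.
    return "b" + base64.b32encode(raw).decode().lower().rstrip("=")


def toy_root(chunk_cids: list[str]) -> str:
    """Simplified stand-in for a Merkle DAG root: a CID over the child CIDs."""
    return make_cid("".join(chunk_cids).encode())


data = bytes(600 * 1024)              # 600 KiB of zero bytes as sample content
chunks = split_chunks(data)
cids = [make_cid(c) for c in chunks]
root = toy_root(cids)
```

Because the identifier depends only on the bytes of a block, two identical chunks receive the same CID, and changing any chunk changes the root as well.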

Object Indexing

For each object, the IPFS network keeps a mapping from its CID to the PeerID of a peer that holds it, so that the object can be located and served to a client. These mappings are stored in a Distributed Hash Table (DHT).[2]
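A toy version of this CID-to-PeerID mapping is sketched below. It assumes a Kademlia-style XOR distance to decide which peers store a given record, runs entirely in one process, and uses a replication factor of 3 chosen only for illustration. The names ToyDHT, provide, and find_providers are hypothetical and do not correspond to a real IPFS API.

```python
import hashlib


def key_id(value: str) -> int:
    """Map a CID or PeerID string into a shared 256-bit keyspace."""
    return int.from_bytes(hashlib.sha256(value.encode()).digest(), "big")


def xor_distance(a: str, b: str) -> int:
    """Kademlia-style distance: XOR of the two keys, compared as an integer."""
    return key_id(a) ^ key_id(b)


def closest_peers(cid: str, peers, k: int = 3) -> list[str]:
    """Return the k peers whose IDs are closest to the CID."""
    return sorted(peers, key=lambda p: xor_distance(cid, p))[:k]


class ToyDHT:
    """In-memory stand-in for a DHT: each peer holds part of the mapping."""

    def __init__(self, peers, k: int = 3):
        self.k = k  # assumed replication factor, for illustration only
        self.tables = {p: {} for p in peers}  # peer -> {cid: set of providers}

    def provide(self, cid: str, provider: str) -> None:
        """Record on the k closest peers that `provider` holds `cid`."""
        for peer in closest_peers(cid, self.tables, self.k):
            self.tables[peer].setdefault(cid, set()).add(provider)

    def find_providers(self, cid: str) -> set[str]:
        """Ask the k closest peers which providers they know for `cid`."""
        found = set()
        for peer in closest_peers(cid, self.tables, self.k):
            found |= self.tables[peer].get(cid, set())
        return found


dht = ToyDHT([f"peer-{i}" for i in range(10)])
dht.provide("example-cid", "peer-7")
providers = dht.find_providers("example-cid")
```

The point of the distance metric is that a client can work out which peers to ask from the CID alone, without any central index.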

Peer Addressing

When a peer joins the IPFS network, it generates a public-private key pair.

Each peer in the network is identified by a unique PeerID, which is a hash of its public key. The PeerID does not change unless it is changed manually. Its function is to ensure that the public key used to secure the communication channel between peers is the same one that identifies them.[2]
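The relationship between a public key and a PeerID can be sketched as follows. This is an illustration under several assumptions: the key is represented by random stand-in bytes rather than a real key pair, the PeerID is a SHA-256 multihash of those bytes in base58 encoding, and the function names are hypothetical. Real nodes hash a serialized form of the key, and some key types are encoded differently.

```python
import hashlib
import hmac
import os

_B58 = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def base58btc(data: bytes) -> str:
    """Encode bytes using the Bitcoin base58 alphabet."""
    n = int.from_bytes(data, "big")
    out = ""
    while n > 0:
        n, rem = divmod(n, 58)
        out = _B58[rem] + out
    # Each leading zero byte is represented by a leading "1".
    pad = len(data) - len(data.lstrip(b"\x00"))
    return "1" * pad + out


def peer_id_from_key(public_key: bytes) -> str:
    """Derive a PeerID as a SHA-256 multihash of the public key bytes."""
    digest = hashlib.sha256(public_key).digest()
    return base58btc(bytes([0x12, len(digest)]) + digest)


def key_matches_peer(public_key: bytes, claimed_peer_id: str) -> bool:
    """Check that the key presented on a connection hashes to the claimed PeerID."""
    return hmac.compare_digest(peer_id_from_key(public_key), claimed_peer_id)


public_key = os.urandom(32)   # stand-in bytes, NOT a real public key
peer_id = peer_id_from_key(public_key)
```

A peer that presents a different key for the same PeerID fails the check in key_matches_peer, which is the property the paragraph above describes.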

Content Immutability

CIDs are inherently permanent and immutable. Dynamic content can instead be published under a hash of the provider's public key (i.e. its PeerID) rather than under a CID. Additional constructs are used to create a stable reference to the content which is not based on a hash of the actual data.[2]
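The difference between an immutable CID and a key-based name can be illustrated with the sketch below. It is a deliberately reduced model: the NameTable class and its methods are hypothetical, content identifiers are plain SHA-256 hex digests, and the signing of records with the provider's private key (which IPFS's naming mechanism, IPNS, relies on) is omitted entirely.

```python
import hashlib
from dataclasses import dataclass


def content_id(data: bytes) -> str:
    """Content-derived identifier: changes whenever the data changes."""
    return hashlib.sha256(data).hexdigest()


def key_name(public_key: bytes) -> str:
    """Key-derived name: stays the same no matter what it points to."""
    return hashlib.sha256(public_key).hexdigest()


@dataclass(frozen=True)
class NameRecord:
    cid: str        # the immutable identifier the name currently points to
    sequence: int   # increases with every update


class NameTable:
    """Toy registry mapping key-derived names to their latest record."""

    def __init__(self):
        self._records = {}

    def publish(self, name: str, cid: str) -> NameRecord:
        # NOTE: a real system would verify a signature here; omitted.
        previous = self._records.get(name)
        sequence = 0 if previous is None else previous.sequence + 1
        record = NameRecord(cid, sequence)
        self._records[name] = record
        return record

    def resolve(self, name: str) -> str:
        return self._records[name].cid


table = NameTable()
name = key_name(b"stand-in public key bytes")   # NOT a real key
first = table.publish(name, content_id(b"version 1 of a page"))
second = table.publish(name, content_id(b"version 2 of a page"))
```

After the second publish, the name is unchanged but resolves to a different content identifier, while each identifier still refers to exactly one version of the data.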

Assessment of Previous Research

The research on the performance of IPFS done in [2] was very in-depth and took place over a period of several months. Surveying a decentralized system takes a significant amount of time because there are nodes and node clusters all over the world, and many appear and disappear in a matter of hours or days. This depth of data collection will not be feasible in this project, so more time will be dedicated to describing existing research rather than doing our own.

Project Direction

The project will consist of several parts. The first section will be a report describing the technical specifications and components of IPFS. The next section will describe the research that has been done. The third part will be my own research and testing, mostly looking at how easy and effective it is to host and distribute multimedia files over IPFS. More planning needs to go into this section. Finally, a summary will be given on the possible future of a decentralized internet.


Nov 20, 2023

Progress

There are a fairly large number of research papers about either IPFS specifically, the sub-components within IPFS, or the general group of protocols and tools which can be used for a decentralized internet. Overall, they cover technical descriptions, experimental research and its results, and societal factors. The majority of my time has been spent reading and summarizing these papers, as well as planning the structure of the final report.

Focus

Part of the final report will include experimental results, but they will not be my own. I have realized that collecting data on a peer-to-peer network is not feasible within the allotted time, and would very likely not give novel insights. Instead, the research will focus more on using IPFS for delivering multimedia content, as this is somewhat neglected in the literature. Some research has been done, though, and I will compile it and provide an assessment.

Next Steps

With the majority of the research done, more time will now go to drafting the final deliverables. The plan at this point is a written report that gives a technical overview of IPFS, looks into experimental findings with some focus on delivering multimedia content, and then assesses the system as a possible complement to, or maybe a replacement for, current client-server infrastructure.