tw.myaw
Pet discovered twtxt by accident, while glancing at tilde.club server stats.
Pet found the idea behind twtxt useful for talking to itself; it just needed some extension.
First of all, pet wants to use MYAW instead of plain text. MYAW looks like the best format for the raw source data pet has needed throughout its lives.
There's no well-known tw.myaw file like in twtxt. Instead, tw.myaw has a concept of channels, and the well-known file is named twchan.myaw.
Its root object is a map with the following structure:
<channel name>:
    filename: # relative name of channel file, may include directory
    archive: # relative path to a directory with YYYY-MM archives
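As an illustration, here is a minimal Python sketch of how a requester might turn one twchan.myaw entry into URLs. It assumes the entry has already been parsed into a dict and that relative names are resolved against the location of twchan.myaw itself; neither is stated above, and channel_urls is a hypothetical helper name.

```python
from typing import Optional, Tuple
from urllib.parse import urljoin


def channel_urls(index_url: str, entry: dict) -> Tuple[str, Optional[str]]:
    """Resolve the channel file and archive directory of one channel entry.

    Assumption: paths are relative to the URL of twchan.myaw.
    """
    file_url = urljoin(index_url, entry["filename"])
    archive = entry.get("archive")
    archive_url = None
    if archive:
        # A trailing slash keeps later joins inside the archive directory.
        archive_url = urljoin(index_url, archive.rstrip("/") + "/")
    return file_url, archive_url
```

For example, with twchan.myaw at https://example.org/~pet/twchan.myaw and filename set to chan/main.myaw, the channel file resolves to https://example.org/~pet/chan/main.myaw.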
All the gibberish is stored in channel files. The root object is a mapping:
channel:
    file_id: # unique identifier of channel file
    about: # channel description
    avatar: # channel avatar
items:
    # list of items
    - id: # unique item identifier (optional)
      parent: # parent item identifier for replies
      ts::isodate: # timestamp
      source: # URL of the source if this item is fetched from somewhere
      text: # the message
      data: # source data in any other format
          type: # JSON, Markdown, etc.
          content: # the data
      tags: # list of tags
      media: # links to media, as in fedi, TBD
New items are always appended to the end of the file, so the requester may download only the latest changes. However, the entire file can be re-created when it goes to the archive. That's why it contains file_id at the very beginning, and the requester must check it against the local copy. If file_id does not match, the requester moves the local copy to the archive and downloads the new file.
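The requester logic above can be sketched in Python roughly as follows. This is only a sketch under assumptions: fetch(offset) is a hypothetical callback that returns the remote bytes from offset to the end (for example via an HTTP Range request), both file_id values are assumed to be already extracted, and the name given to the archived local copy is made up for illustration.

```python
import shutil
from pathlib import Path
from typing import Callable


def sync_channel(
    local: Path,
    archive_dir: Path,
    local_file_id: str,
    remote_file_id: str,
    fetch: Callable[[int], bytes],
) -> str:
    """Bring the local channel file up to date and report the action taken."""
    if not local.exists():
        # Nothing local yet: download the whole file.
        local.write_bytes(fetch(0))
        return "downloaded"
    if local_file_id != remote_file_id:
        # The channel file was re-created: keep the old copy, start over.
        archive_dir.mkdir(parents=True, exist_ok=True)
        # Hypothetical naming, so that older copies are not overwritten.
        old_name = f"{local.stem}-{local_file_id}{local.suffix}"
        shutil.move(str(local), str(archive_dir / old_name))
        local.write_bytes(fetch(0))
        return "archived-and-downloaded"
    # Same file: items are append-only, so only the tail is needed.
    tail = fetch(local.stat().st_size)
    with local.open("ab") as f:
        f.write(tail)
    return "appended"
```

Using the local file size as the download offset relies on the append-only rule; if a channel file were edited in place, the file_id would have to change as well.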
The data can be archived when the size goes beyond some limit or when channel preferences change. Thus, there's no need to include channel info in each post, as fedi does for users.
Archive files are kept in subdirectories named YYYY-MM.
File names have the following format:
CHANNEL-YYYYMMDD[HHMM]-YYYYMMDD[HHMM].myaw
The first date is the date/time of the first record (UTC), and the second date is the date/time of the last record. The HHMM part is optional; it is used when there are multiple large files for the same day.
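A small Python sketch of building such a file name. It assumes timezone-aware datetimes and that the second date is also written in UTC, which the description above only states for the first one; archive_name is a hypothetical helper.

```python
from datetime import datetime, timezone


def archive_name(
    channel: str, first: datetime, last: datetime, with_time: bool = False
) -> str:
    """Build CHANNEL-YYYYMMDD[HHMM]-YYYYMMDD[HHMM].myaw.

    `first` and `last` must be timezone-aware; both are converted to UTC
    (an assumption for `last`).
    """
    fmt = "%Y%m%d%H%M" if with_time else "%Y%m%d"
    first_utc = first.astimezone(timezone.utc)
    last_utc = last.astimezone(timezone.utc)
    return f"{channel}-{first_utc.strftime(fmt)}-{last_utc.strftime(fmt)}.myaw"
```

For a channel named main with records from 08:05 to 23:59 UTC on 2024-03-01, this gives main-20240301-20240301.myaw, or main-202403010805-202403012359.myaw when with_time is set.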
Files in the archive can be compressed. LZMA is the preferred method.
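For instance, an archived file could be compressed with Python's standard lzma module. This is just a sketch: the .xz suffix and the removal of the uncompressed original are assumptions, not part of the format described here.

```python
import lzma
import shutil
from pathlib import Path


def compress_archive(path: Path) -> Path:
    """Compress an archive file with LZMA and remove the original."""
    # Assumption: the compressed file keeps its name plus an .xz suffix.
    target = path.with_name(path.name + ".xz")
    with path.open("rb") as src, lzma.open(target, "wb") as dst:
        shutil.copyfileobj(src, dst)
    path.unlink()
    return target
```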
Intended use and TODO:
- twtxt derived from tw.myaw
- collect fedi timelines into tw.myaw, group and display by tags, find frequent/rare words/ngrams
- collect tw.txt from other sources
- an interface to post to tw.myaw and to fedi